Gp41 and Gag amino acids linked to HIV‐1 protease inhibitor‐based second‐line failure in HIV‐1 subtype A from Western Kenya

Abstract
Introduction: Failure of protease‐inhibitor (PI)‐based second‐line antiretroviral therapy (ART) with medication adherence but no protease drug resistance mutations (DRMs) is not well understood. This study investigated the involvement of gp41 and gag as alternative mechanisms, not captured by conventional resistance testing and particularly relevant in resource‐limited settings where third‐line ART is limited.
Methods: We evaluated gp41 and gag for unique amino acids in seven subtype A infected Kenyans failing second‐line therapy with no PI resistance yet detectable lopinavir (query dataset), compared to seven similar‐setting patients with PI resistance or undetectable lopinavir and 69 publicly available subtype A Kenyan whole‐genome sequences.
Results: Three gp41 (607T, 641L, 721I) and four gag (124S, 143V, 339P, 357S) amino acids were significantly more frequent in the query dataset compared to the other datasets, with significantly high co‐occurrence.
Conclusion: The genotypic analysis of a unique group of HIV‐1 subtype A infected patients identified seven amino acids that could potentially contribute to a multi‐gene mechanism of PI‐based ART failure in the absence of PI DR mutations.


| INTRODUCTION
The main cause of HIV-1 antiretroviral therapy (ART) failure is drug resistance (DR) development. In resource-limited settings (RLS) there are only two recommended ART lines: a nonnucleoside reverse transcriptase inhibitor (NNRTI) based first-line, and a boosted protease inhibitor (PI) based second-line, each with two nucleoside reverse transcriptase inhibitors (NRTIs). However, second-line failure is already observed and is estimated to escalate, while third-line options remain limited [1]. Detecting and understanding second-line DR in RLS is therefore important to sustain treatment efficacy and plan for effective future therapy.
Failure of second-line ART in RLS, typically based on lopinavir/ritonavir (LPV/r), is seen at an average rate of 30% [1]. Known PI DR mechanisms involve the development of mutations in and around the protease active site that change its interaction with the inhibitor [2]. Such changes typically lead to fitness costs and loss of replication capacity that can be compensated by more distant protease mutations [3]. However, whereas >70% of patients who fail first- or second-line ART have reverse transcriptase (RT) DR mutations (DRMs), patients failing second-line ART have low levels (average 18%) of PI DR [4], which is not well understood. This observation suggests alternative mechanisms of PI failure, such as (1) non-adherence or decreased PI levels [5,6], (2) resistant minor variants not detected by conventional assays [7], or (3) regions outside the protease such as gag [8], with increasing data suggesting its importance, and the recently suggested gp41 in envelope (env) [9].
To further understand low PI DR upon second-line ART failure and potentially associated alternative genomic regions, we focused on the gp41 region and investigated whether a unique cohort of HIV-1 infected Kenyans failing second-line ART with no protease DRMs, but detectable LPV levels, have amino acids that could potentially be associated with PI DR, while also examining the gag gene.

| METHODS
Plasma samples from Kenyan adults participating in a study at the Academic Model Providing Access to Healthcare (AMPATH) in Eldoret, Kenya [10] were selected if the participants (1) were infected with HIV-1 subtype A (by pol), the most common subtype in Kenya, (2) were failing LPV/r-based second-line ART after >6 months, with prior >6 months NNRTI-based first-line ART, (3) had detectable plasma LPV and (4) had no protease DRMs. This "Query Dataset," hypothesized to have alternative PI DR mechanisms, was compared to a "Background Dataset," obtained from AMPATH patients from the same study, eligible for criteria (1) to (3) above, but with PI DRMs or with undetectable LPV levels, in which alternative PI DR mechanisms were less likely. At the time of this study, monitoring of patients on ART was mostly immunological- or clinical-based. Viral load or drug resistance testing was limited and therefore the precise cause of treatment failure was unknown.
CD4 (FACSCalibur platform; BD-Biosciences San Jose, CA) and viral load testing (Amplicor; Roche Molecular, Pleasanton, CA) were performed at the AMPATH Laboratory; pol genotyping at the Providence-Boston Center for AIDS Research; and LPV levels (liquid chromatography/tandem mass spectrometry) at the University of North Carolina Center for AIDS Research.
To compensate for the small sample sizes of the Query and Background Datasets and to further evaluate the presence of unique gp41 and/or gag amino acids associated with PI DR, two additional Kenyan, HIV-1 subtype A, full-genome datasets were compiled from the Los Alamos Database (http://www.hiv.lanl.gov): (1) an "ART-Naïve Dataset", sequences from literature-confirmed ART-naïve patients; and (2) a "Population Dataset", all remaining sequences, from patients with unclear ART histories but mostly from earlier (<2002) studies before PI introduction to Kenya. Insignificant presence of unique gp41 and/or gag amino acids in these datasets as compared to the Query Dataset would support their potential involvement in alternative PI DR mechanisms, rather than being subtype-specific polymorphisms or evolutionary-associated changes.
We hypothesized that Query Dataset viruses have unique amino acids outside pol that may be associated with PI failure, and that these are not common in patients from the same cohort with PI DRMs or lacking adherence as represented by undetectable LPV (Background Dataset), or in the Naïve or Population Datasets. To test this hypothesis we first identified gp41 and/or gag signature amino acids that differ between the Query and the Background Datasets, which are from the same Kenyan setting. The Los Alamos Viral Epidemiology Signature Pattern Analysis (VESPA) tool was used to identify signature amino acids, defined as positions at which the most common amino acids differ between two datasets. A threshold of zero was used, selecting the most common amino acid regardless of its frequency. Second, we identified which of these signature amino acids also differ between the Query and the Naïve, and the Query and Population Datasets. Third, we compared proportions of these signature amino acids between the Query Dataset and the three other datasets using Fisher exact tests (p<0.05 considered significant). Lastly, we did similar comparisons between the Background Dataset and the Naïve and Population Datasets, to explore the presence of these amino acids within the Kenyan population.
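The first step above, finding positions where the most common amino acid differs between two aligned datasets, can be illustrated with a minimal Python sketch. This is not the VESPA tool's implementation: the function names and the toy four-residue sequences are hypothetical, and ties between equally frequent residues are broken by first occurrence, a choice the text does not specify.

```python
from collections import Counter


def most_common_residues(alignment):
    """Return the most frequent residue in each column of an aligned dataset.

    No frequency cutoff is applied (analogous to a threshold of zero).
    Ties are resolved in favour of the residue encountered first.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must be aligned to the same length")
    return [Counter(column).most_common(1)[0][0] for column in zip(*alignment)]


def signature_positions(query, background):
    """List (position, query_residue, background_residue) for every column
    where the most common residue differs between the two datasets.
    Positions are 1-based alignment columns."""
    q = most_common_residues(query)
    b = most_common_residues(background)
    if len(q) != len(b):
        raise ValueError("datasets must share the same alignment length")
    return [(i + 1, qr, br) for i, (qr, br) in enumerate(zip(q, b)) if qr != br]


# Hypothetical toy alignments, not study data.
toy_query = ["MTLS", "MTLS", "MALS"]
toy_background = ["MALS", "MALI", "MALI"]
print(signature_positions(toy_query, toy_background))
# [(2, 'T', 'A'), (4, 'S', 'I')]
```

In this sketch a position counts as a signature even when the most common residue is only marginally more frequent than the alternatives, which is why the subsequent proportion comparisons are needed to separate meaningful differences from noise in small datasets.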

| RESULTS
Samples from 14 eligible patients were available (median CD4 count 115 cells/µl and viral load 48,800 copies/ml), seven each in the Query and Background Datasets (Table 1). All Query Dataset patients had NRTI and/or NNRTI, but no PI DRMs despite detectable LPV. In the Background Dataset three patients had major PI DRMs, two with detectable and one with undetectable LPV; and four had no PI DRMs with undetectable LPV. For comparison, DRMs were minimal in the Naïve Dataset (NNRTI-associated V179T (n=2) and E138A (n=1)); and the Population Dataset (PI-associated M46L (n=1), NRTI-associated K65R (n=1), NNRTI-associated V106I (n=1) and E138A (n=2)).
Generated sequences included 6/7 gp41 and 7/7 gag in the Query and 5/7 gp41 and 6/7 gag in the Background Datasets. Phylogenetic analysis of all gp41 and gag sequences confirmed HIV-1 subtype A designation, with no epidemiologic linkage (data not shown).
VESPA identified 24 gp41 positions where amino acids differed between the Query and Background Datasets. Of those 24, 14 also differed between the Query and Naïve Datasets. At three of the 14 positions the amino acids were significantly more frequent in the Query compared to the Background (607T) or the Naïve Dataset (607T, 641L, 721I). These three amino acids were also signature in the Query compared to the Population Dataset, two significantly more frequent (607T, 721I) (Table 2).
Similar gag analyses resulted in 17 positions identified by VESPA where common amino acids differed between the Query and Background Datasets, 11/17 also different between the Query and Naïve Datasets. At four of these 11 sites amino acids were significantly more frequent in the Query compared to the Naïve Dataset (124S, 143V, 339P and 357S), and although they were not significantly more prevalent compared to the Background and Population Datasets they were selected for further analysis (Table 2). These four amino acids were also signature in the Query compared to the Population Dataset. Gag was also examined for previously described HIV-1 subtype B-specific cleavage site (CS: MA/CA 128I/T/A, NC/p1 431V, 436E/R and 437T/V; and in p1/p6-gag 449F/P/V, 452S/K and 453A/L/T) and non-CS (R76K, Y79F and T81A) mutations, associated with either PI DR or exposure [8]. Although three CS mutations (128del, 436R and 449P) and two non-CS mutations (76K and 79F) were present in the Query Dataset, they were similarly common (>30%) and not significantly different from any other dataset, and 449P was conserved in all sequences from the four datasets, suggesting a subtype A-associated gag polymorphism. The occurrence of previously described gag CS and non-CS mutations in the Naïve Dataset further highlights the possible role of subtype-specific gag variability and the potential for multiple pathways to PI failure.
In similar comparisons of the Background and Naïve, or the Background and Population Datasets, all four gag mutations and two of the three gp41 mutations did not differ between the datasets, supporting our hypothesis. One mutation, gp41 641L, occurred more often in the Background (2/5) than in the Naïve Dataset (1/31; p=0.04). One of these two Background Dataset patients had a detectable LPV level with three-class DR, still potentially supporting the alternative DR mechanism hypothesis. The other patient had an undetectable LPV level and no PI DR, suggesting the need for larger numbers and further characterization.
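As an illustration of the proportion comparisons, the gp41 641L counts quoted above (2/5 in the Background versus 1/31 in the Naïve Dataset) can be run through a Fisher exact test. The following Python sketch uses only the standard library; the function name is hypothetical, and a two-sided test is assumed because the text does not state which alternative was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "carriers" fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Rows: Background, Naive; columns: 641L present, 641L absent.
p_value = fisher_exact_two_sided(2, 3, 1, 30)
print(round(p_value, 3))  # 0.045
```

Under these assumptions the result is approximately 0.045, in line with the reported p=0.04, and it shows how close to the 0.05 threshold a comparison can sit when one dataset has only five sequences.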

| DISCUSSION
This study investigated alternative DR mechanisms in a small but unique Kenyan cohort of HIV-1 subtype A infected adults failing PI-based second-line ART with a particular focus on gp41. We identified three novel amino acids that were significantly more prevalent in individuals with no protease DRMs but detectable LPV levels, as compared to ART-naïve Kenyans: two in the heptad repeat region (607T, 641L) and one in the cytoplasmic tail (721I) of gp41. We also found a higher prevalence of four gag mutations, one in the matrix (124S) and three in the capsid (143V, 339P, 357S) structural proteins of gag. Though these findings may suggest a potential role for these amino acids in new mechanisms of PI resistance, any hypothesis on their actual involvement in alternative mechanisms is speculative. Current research suggests that PIs do not only block the viral protease activity, but may also affect viral entry that is facilitated by env [9]. In that study gp41 conferred PI resistance, whereas gag and pol had no PI-associated DRMs. The authors proposed an alternative resistance mechanism that includes amino acids within env that can overcome PI-mediated inhibition of viral entry, by changing the interaction between the cytoplasmic tail of gp41 and the uncleaved gag. One candidate gp41 amino acid identified here is in the cytoplasmic tail, further supporting this suggested alternative mechanism. The other two amino acids are in the heptad repeat regions, important in viral fusion. However, as also noted by Rabi et al., identification of amino acids within gp41 that might confer PI resistance is complicated by the high diversity within this region, requiring more genotypic and phenotypic analysis.
Studies regarding the involvement of regions outside the protease in PI DR have shown that gag CS and non-CS mutations might be associated with PI failure with or without PI DRMs, restoring viral replication capacity, increasing viral fitness and improving protease binding affinities for the mutant gag substrate [8,11]. Substitutions in gag have also been linked to reduced PI drug sensitivity in the presence of wild-type protease [12,13], highlighting the need to possibly include gag in drug susceptibility assays. A recent study showed that mutations in gag can develop and accumulate over time and contribute independently to drug resistance, or lead to further mutation development within protease [14]. Similar investigations including, for example, mutagenetic tree analysis to determine the interconnection between gag and these mutations, would be important but were not performed here due to limited numbers. Notably, most studies addressing genetic variation within gag have focused on HIV-1 subtype B samples, with available data suggesting more variation within non-B subtypes [15]. The new gag amino acids identified here could contribute in a similar way, though this needs to be determined. Further investigation such as phenotypic confirmation of the amino acid contribution is urgently needed to validate our findings and determine the significance, mechanism and impact of these candidate gp41 and gag amino acids in PI-based second-line ART failure. Subtype A was examined here as it is the most common HIV-1 variant in Kenya (approximately 70%) [16]. However, the presence and role of these amino acids in other subtypes as well as their relation to other PIs warrant further investigation.
Caveats of this study include, first, the need for phenotypic confirmation of the candidate amino acids' role during PI treatment failure, either as independent or compensatory mutations; second, the availability of only a single LPV level measurement, not adequately reflecting adherence [17]; third, the use of population sequencing without consideration of minor resistance variants, which may interfere with ART effectiveness [7]; and lastly, though patients were unique, sample sizes were small, limiting power and generalizability and necessitating widening of the datasets to Kenyan but non-AMPATH settings, with at times less complete treatment histories and more limited interpretation.

| CONCLUSION
Genotypic analyses of a small yet unique cohort of subtype A infected patients from western Kenya identified seven amino acids in gp41 and gag that could potentially contribute to a multi-gene mechanism of PI-based ART failure in the absence of PI DR mutations. The alternative PI DR pathways investigated here, which require larger numbers and phenotypic validation, will most probably not be exclusively responsible for failure of PI-based second-line ART in the absence of PI DR. However, since HIV-1 gp41 and gag are not routinely analysed phenotypically or genotypically, identification of alternative mechanisms involving these genes is limited. Data presented here propose potential avenues for further investigation of such mechanisms. Such information is required to improve treatment monitoring and DR interpretation, particularly in RLS, as well as to conduct strategic planning for third-line ART options.