Evaluation of WHO immunologic criteria for treatment failure: implications for detection of virologic failure, evolution of drug resistance and choice of second-line therapy in India

Introduction: Routine HIV viral load (VL) testing is not available in India. We compared test performance characteristics of immunologic failure (IF) against the gold standard of virologic failure (VF), examined evolution of drug resistance among those who stayed on a failing regimen because they did not meet criteria for IF and assessed implications for second-line therapy.
Methods: Participants on first-line highly active antiretroviral therapy (HAART) in Bangalore, India, were monitored for 24 months at six-month intervals, with CD4 count, VL and genotype, if VL >1000 copies/ml. Standard WHO criteria were used to define IF; VF was defined as having two consecutive VL >1000 copies/ml or one VL >10,000 copies/ml. Resistance was assessed using standard International AIDS Society-USA (IAS-USA) recommendations.
Results: Of 522 participants (67.6% male, mean age of 37.5; 85.1% on nevirapine-based and 40.4% on d4T-containing regimens), 57 (10.9%) had VF, 38 (7.3%) had IF and 13 (2.5%) had both VF and IF. The sensitivity of immunologic criteria to detect VF was 22.8%, specificity was 94.6% and positive predictive value was 34.2%. Forty-four participants with VF only continued on their failing first-line regimen; by the end of the study period, 90.9% had M184V, 63.6% had thymidine analogue mutations (TAMs), 34.1% had resistance to tenofovir, and 63.6% had resistance to etravirine.
Conclusions: WHO IF criteria have low sensitivity for detecting VF, and the presence of IF poorly predicts VF. Relying on CD4 counts leads to unnecessary switches to second-line HAART and continuation of failing regimens, jeopardizing future therapeutic options. Universal access to VL monitoring would avoid costly switches to second-line HAART and preserve future treatment options.


Introduction
An estimated 6.6 million HIV-infected individuals are receiving highly active antiretroviral therapy (HAART) in resource-limited settings [1]. In India alone, approximately 400,000 individuals have been started on HAART between 2004 and 2011 [2]. Although this rapid scale-up of HAART has led to dramatic declines in mortality, many challenges remain, including optimal monitoring for treatment failure, the threat of emerging drug resistance and the availability of second-line therapy.
Treatment failure can be defined as progression of disease after initiation of HAART. Failure can be assessed by clinical (the appearance of new opportunistic infections, on-going weight loss, etc.), immunologic (a decline in CD4 count) or virologic (a viral rebound above a set threshold of 200 copies/ml) criteria [3]. It has been well established that virologic failure (VF) precedes immunologic failure (IF), which in turn precedes clinical failure [4]. A detectable viral load (VL) is among the earliest signs of treatment failure, and continuing a failing regimen has been associated with increased morbidity and mortality [5,6] as well as accumulation of drug resistance mutations that further limit second-line options [7]. Therefore, routine virologic monitoring is the standard of care in the developed world [8]. However, because of cost and limited access to laboratory technology, routine VL monitoring is not performed or recommended in resource-limited settings [9]. Instead, clinicians in these settings rely on defining treatment failure based on CD4 cell counts, clinical manifestations of disease progression, or both. Several studies from resource-limited settings have shown that lack of VL monitoring leads to frequent misclassification of failure and therefore either premature switching to second-line regimens or a delay in change to second-line agents [10–17].
The aims of this longitudinal analysis were to (1) evaluate the sensitivity, specificity and positive predictive value (PPV) of WHO-proposed criteria for treatment failure based on CD4 count, to detect true treatment failure based on virologic criteria in an Indian setting; and (2) assess the consequences of remaining on a failing regimen among those who do not meet criteria for IF but are, in fact, failing by virologic criteria, by examining the evolution of drug resistance and implications for second-line antiretroviral therapy.

Methods
The data presented in the present study were collected as part of a two-year observational cohort study of adherence to HAART among HIV-infected adults attending public and private clinics in Bangalore, India. The full methods of this study have been published elsewhere [18].

Setting
The study was conducted at three clinics (one public, one private, and one public-private partnership clinic) in Bangalore, India. Bangalore is a major metropolitan city in Karnataka, which is one of the six states in India with a high HIV prevalence [2]. The most commonly prescribed antiretroviral therapy in this setting is a generic nevirapine-based regimen, usually including lamivudine and either zidovudine or stavudine (and, very infrequently, tenofovir). Patients with a concurrent diagnosis of active tuberculosis infection are placed on efavirenz-based regimens because of potential drug interactions with rifampicin. Patients attending the public and the public-private partnership clinic received their HIV medications free of cost [19].
Once patients are on a stable first-line regimen, they are monitored every six months, and treatment monitoring includes weight, haemoglobin, alanine aminotransferase (ALT), and CD4 count. Routine VL monitoring is not available. Patients suspected of having treatment failure for WHO-defined immunologic and clinical criteria are referred to select government HIV treatment centres designated as Centers for Excellence for evaluation for second-line therapy [20]. Beginning in 2011, recommendations of the National AIDS Control Organization (NACO) of India were to obtain targeted VL testing for those with suspected treatment failure to avoid unnecessary switches to second-line therapy. The recommended second-line regimen is a combination of zidovudine or tenofovir, lamivudine, and ritonavir-boosted atazanavir.

Participants
Participants were eligible if they were 18 years or older, were HIV infected, were on antiretroviral therapy for at least one month (and therefore not HAART naïve) at the time of enrolment, and were willing to participate in all follow-up visits. A total of 533 consecutive patients seeking care at the three clinic sites in Bangalore who met eligibility criteria were enrolled and followed in the study between August 2007 and December 2011. In this analysis, we included the 522 patients who had CD4 and VL data available at time points six or more months after starting HAART.

Procedures
The study was approved by the institutional review boards of the University of California–San Francisco and St John's National Academy of Health Sciences.
Potential participants were referred to the study interviewers by clinic staff. Following referral, participants were taken to a separate room for eligibility confirmation and the study interview. If found eligible, participants were monitored every 3 months for 24 months and were interviewed to assess adherence. Blood work was done every six months. All patients provided written informed consent.
Participants' blood was drawn by trained staff phlebotomists. CD4 cell count and HIV plasma VL were assessed at six-month intervals, and HIV genotype was assessed if VL was >1000 copies/ml. CD4 cell counts were performed on whole-blood specimens collected in an EDTA (ethylenediaminetetraacetic acid) tube using a single-platform flow cytometry assay (PCA system; Guava Technologies Inc., Hayward, CA, USA). A real-time polymerase chain reaction (PCR) assay with a fluorescein-labelled TaqMan probe was used for the quantitation of VL. The test was developed and its performance characteristics were determined at Molecular Diagnostics and Genetics, Reliance Life Sciences (Mumbai, India). This assay can reliably detect an HIV RNA level of 100 copies/ml of blood. The HIV-1 drug resistance assay was performed at YRGCARE Infectious Diseases Laboratory (Chennai, India) using an in-house method [21]. The performance characteristics of this method have been validated [22], and the assay is certified by the TREAT Asia Quality Assurance (TAQAS) Program (National Serology Reference Laboratory, Australia). Of note, we did not have baseline genotypes on these participants. Furthermore, we did not assess for the presence of protease inhibitor (PI)-associated mutations because PIs were very rarely used in the setting at the time of the study.
The results of the VL testing were provided to the patients and their treating clinicians. Providers were then able to use these results to make treatment-related decisions for each patient, together with other relevant information such as clinical data; national guidelines at the time the study was conducted; and, for patients in the private clinic setting, their ability to pay for second-line therapy. These physicians may have been limited in their ability to refer to second-line therapy because second-line therapy was available only through small pilot programmes in a few centrally located care settings in India.

Definitions
We used standard WHO-defined criteria for IF in this analysis since these are the same criteria adopted by NACO in India. A participant was categorized as having IF if he or she met one of the following three criteria after at least six months on HAART: (1) fall of CD4 count to pre-therapy baseline or below, (2) ≥50% fall of absolute CD4 count from the on-treatment peak value or (3) persistent CD4 levels below 100 cells/mm³.
VF was defined as, after at least six months on HAART, having a VL greater than 1000 copies/ml at two consecutive measurements (six months apart at study visits) or one VL measurement >10,000 copies/ml.
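The definitions above can be expressed as a short classification routine. The following is a minimal sketch and not the study's analysis code: the function names, the input layout (chronological lists of measurements, all taken after at least six months on HAART) and, in particular, the reading of "persistent" as two consecutive CD4 counts below 100 cells/mm³ are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def immunologic_failure(cd4: Sequence[float], baseline_cd4: Optional[float]) -> bool:
    """Apply the three WHO IF criteria at the most recent visit.

    cd4: on-treatment CD4 counts (cells/mm^3) in chronological order,
         all measured after at least six months on HAART (assumed layout).
    baseline_cd4: pre-therapy CD4 count, or None if unknown.
    """
    if not cd4:
        return False
    latest = cd4[-1]
    peak = max(cd4)
    # (1) Fall of CD4 count to pre-therapy baseline or below.
    if baseline_cd4 is not None and latest <= baseline_cd4:
        return True
    # (2) At least a 50% fall from the on-treatment peak value.
    if latest <= 0.5 * peak:
        return True
    # (3) Persistent CD4 below 100; "persistent" is ASSUMED here to mean
    #     the two most recent consecutive measurements.
    if len(cd4) >= 2 and all(value < 100 for value in cd4[-2:]):
        return True
    return False


def virologic_failure(vl: Sequence[float]) -> bool:
    """VF: two consecutive VL > 1000 copies/ml, or any single VL > 10,000."""
    if any(value > 10_000 for value in vl):
        return True
    return any(a > 1000 and b > 1000 for a, b in zip(vl, vl[1:]))


if __name__ == "__main__":
    # Hypothetical participant: viral rebound with a still-preserved CD4 count.
    print(virologic_failure([80, 2500, 4000]))              # VF criteria met
    print(immunologic_failure([380, 410, 395], baseline_cd4=150))  # IF criteria not met
```

The hypothetical participant in the usage example illustrates the discordance examined in this study: virologic criteria are met while none of the three immunologic criteria are.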

Data analysis
Descriptive statistics included means, standard deviations, medians and proportions when appropriate. Performance of the immunologic criteria was assessed by calculating their sensitivity, specificity and PPV, with exact confidence intervals, using VF as the "gold standard." These analyses were performed both on the sample as a whole, and for males and females separately. One participant who identified as transgender was included in the analyses of the whole sample, but excluded from the analyses by gender. Differences between men and women were assessed via a t-test (age), Mann-Whitney U-test (CD4 and time on HAART), or χ² test (categorical variables). Due to small numbers, no gender-specific analyses were done on the genotyping data. Analyses were done in SPSS, version 18.0.2 (SPSS Inc., Chicago, IL, USA), and Stata, version 12.1 (StataCorp LP, College Station, TX, USA).
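For readers who wish to reproduce the test performance calculations, the sketch below computes sensitivity, specificity and PPV from a 2×2 table with exact binomial confidence intervals. It is an illustration in Python rather than the SPSS or Stata code used in the study; it assumes that "exact" refers to the Clopper-Pearson interval, and the function names are hypothetical. The counts in the usage example are those reported in the Results (13 participants with both VF and IF, 44 with VF only, 25 with IF only), with the remaining 440 of 522 derived by subtraction.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Bisection for an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Clopper-Pearson (exact) confidence interval for k successes in n trials."""
    # Lower limit: p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


def performance(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity and PPV, each as (estimate, lower, upper)."""
    out = {}
    for name, k, n in (
        ("sensitivity", tp, tp + fn),
        ("specificity", tn, tn + fp),
        ("ppv", tp, tp + fp),
    ):
        out[name] = (k / n, *exact_ci(k, n))
    return out


if __name__ == "__main__":
    # tp: VF and IF; fn: VF only; fp: IF only; tn: neither (derived: 522 - 57 - 25).
    for name, (est, lo, hi) in performance(tp=13, fn=44, fp=25, tn=440).items():
        print(f"{name}: {100 * est:.1f}% (95% CI: {100 * lo:.1f}%-{100 * hi:.1f}%)")
```

If the Clopper-Pearson assumption is correct, the printed values should closely match the estimates and intervals reported in the Results.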

Results
Five hundred and twenty-two participants were followed for a median of 24 months. Two-thirds of participants were male (n=353; 67.6%), and the mean age was 37.5 (standard deviation (SD): 8.5) (Table 1). At the time of study enrolment, participants had been on HAART for a median of 17 months (interquartile range (IQR): 6–30 months) and median CD4 at that time was 333 (IQR: 210–470). A vast majority of participants (n=444, 85.1%) were on nevirapine-based regimens, and 40.4% (n=211) were on d4T-containing regimens. Seventy-seven per cent of participants were on their first-ever ART regimen; the remaining were also on first-line agents, but may have undergone some treatment modification for toxicity or drug interactions. Female patients were significantly younger (mean age: 34.3 years for females compared to 39.0 years for males) and had a higher median CD4 count at study enrolment (362 for females versus 315 for males) (Table 1).
A total of 57 (10.9%) patients experienced VF during the study; 13 of the 57 patients met criteria for IF at the time VF was detected, leaving 44 patients with VF only (Table 2). An additional 25 patients (4.8%) met criteria only for IF based on one or more of three WHO-defined criteria for IF. The sensitivity of the WHO IF criteria to detect VF was therefore 22.8% (95% confidence interval (CI): 12.7%–35.8%), the specificity was 94.6% (95% CI: 92.2%–96.5%), and the PPV was 34.2% (95% CI: 19.6%–51.4%). There were no differences in the occurrence of VF by gender (40/353 men and 17/168 women). Sensitivity, specificity and PPV also did not differ by gender.
Of the 44 total patients with VF only, 19 (43.2%) eventually developed IF during the study follow-up period. Median time from VF to IF in this subset of patients was 18 months. We assessed the evolution of drug resistance in these 44 patients between the time VF was detected and either the end of the study period or the time when IF was detected, when presumably the patient would be referred for second-line therapy.
Twenty-nine of these 44 patients already had VF at enrolment into the study and had been on therapy for a median of 25 months at the time of their initial study visit (Table 3). A majority (86.2%; n=25) of participants with VF at their baseline visit had an M184V mutation. Nearly one-half (48.3%; n=14) of these participants had at least one TAM: 20.7% (n=6) had one TAM, 17.2% (n=5) had two TAMs, and 10.3% (n=3) had three or more TAMs, indicating that they may have been failing for many months by the time they enrolled in the study. Additionally, two participants were found to have the 69 insertion site mutation, and four more patients had K65R or K70E. Six of the 29 participants (20.7%) had evidence of genotypic resistance to tenofovir.
With regard to NNRTI mutations, more than one-half of the participants with VF at enrolment (51.7%; n=15) had Y181C owing to the prevalence of nevirapine use in first-line regimens. Seven (24.1%) had K103N, seven (24.1%) had G190A and six (20.7%) had K101E. All but two patients (93.1%) were resistant to both nevirapine and efavirenz. Based on the NNRTI mutations that were present, and using the weighted genotypic score for the determination of etravirine susceptibility as outlined above, 18 of the 29 participants (62.1%) had a calculated etravirine score of 2.5 or greater, corresponding to intermediate- or high-level resistance to etravirine.
Fifteen participants developed VF during the study. The only NRTI mutation detected in any of these participants was M184V, with eight of the 15 patients (53.3%) harbouring this mutation. None of the participants who developed VF during the study were resistant to tenofovir at the time that treatment failure was detected. There were a few NNRTI mutations present: two patients had Y181C, two had K103N, five had G190A and one had K101E. A total of nine (60.0%) were resistant to nevirapine and efavirenz. However, only two patients (13.3%) had an etravirine mutation score of ≥2.5.
The 44 participants with VF only and without evidence of IF (includes 29 with VF at baseline and the 15 who developed VF during the study) were followed for a median of 18 months (either until the end of the study or until they developed evidence of IF) while they remained on a failing regimen. During this time, seven additional participants developed M184V, three developed 69 insertion site mutations and two developed Q151M. Twenty additional TAMs also developed in the 44 participants. When the evolution of NNRTI mutations was assessed, three more participants developed K103N, two more developed Y181C, and six more each developed G190A and K101E. By the end of the study period, 34.1% of participants (n=15) were found to have mutations that would confer high-level resistance to tenofovir. Based on the NNRTI mutations present, 63.6% (n=28) of participants had etravirine mutation scores ≥2.5, associated with intermediate- or high-level resistance to etravirine.
Mutation patterns among the 13 participants with VF and IF detected at the same time were also examined and found to be comparable to the mutation patterns at the end of the study for the 44 participants with VF but no IF: nine of the 13 participants (69.2%) had the M184V mutation; six (46.2%) had any TAMs, with a majority having three or more TAMs; eight (61.5%) had predicted resistance to tenofovir; and nine (69.2%) had an etravirine mutation score ≥2.5.

Discussion
In this longitudinal study of 522 HIV-infected participants on first-line HAART in Bangalore, India, we found that WHO-defined criteria for IF are insensitive at detecting virologic treatment failure and, more importantly, that the delay in diagnosis of treatment failure results in accumulation of a substantial number of drug-resistant mutations and has the potential to have a significant impact on the efficacy of second-line antiretroviral treatment.
Participants who developed VF during the study period, and thus were recognized as having treatment failure immediately, did not have any mutations associated with tenofovir resistance at the time of VF. However, by the end of a follow-up period of a median of 18 months, when participants finally developed evidence of IF, one in three participants had mutations that conferred high-level resistance to tenofovir. Subtype C HIV-1 virus is predominant in India, and it has been previously established that in this subtype, K65R is selected for by on-going exposure to d4T and leads to tenofovir cross-resistance [25]. Although NACO is now moving towards phasing out the use of d4T, nearly half of the participants in our study were on d4T-containing regimens. On-going exposure to thymidine NRTIs, including d4T and azidothymidine (AZT), in the setting of on-going viral replication contributed to further accumulation of K65R and TAMs, resulting in substantial levels of resistance to tenofovir. Although tenofovir resistance was not specifically addressed, studies from northern India and Kenya among patients failing first-line HAART also found similarly high rates of both NRTI and NNRTI resistance [26,27].
Furthermore, based on the presence of TAMs and other NRTI mutations, a substantial proportion of individuals in our cohort also had complete resistance to AZT and d4T, essentially making the whole NRTI class of drugs a nonviable option for a significant number of patients needing second-line therapy. Similar concerns have been reported in studies from Africa, where 17%–53% of patients with treatment failure had no active NRTI options [28–30]. This is especially alarming given the extremely limited access to second-line regimens. Salvaging patients with complete NRTI resistance could potentially require the use of newer agents such as raltegravir, maraviroc, etravirine and darunavir in combination with one another [31,32]; however, none of these drugs are widely available in these settings.
According to India's 2011 national guidelines, when patients meet the criteria for IF, they are tested to confirm VF and then empirically started on ritonavir-boosted atazanavir, tenofovir and lamivudine. In nearly one-third of patients, this would mean that they will be receiving PI monotherapy since the other two drugs in the regimen are completely inactive or only partially active. Although a recent AIDS Clinical Trials Group (ACTG) study has shown some promise with lopinavir-ritonavir monotherapy in resource-limited settings in the short term [33], this may not be a viable long-term option. One report from India on patients with PI-based second-line therapy after first-line therapy failure showed that 72% were still viremic and a substantial proportion developed PI mutations while on PI-based second-line regimens [34]. Nearly one-quarter developed mutations associated with darunavir in this study. Therefore, preserving NRTI options with early detection of VF is imperative to preserve treatment success in resource-limited settings.
Etravirine is a second-generation NNRTI that has been shown to be effective in achieving virologic suppression when given in combination with other fully active drugs in patients with prior treatment experience [35]. However, treatment with first-generation NNRTIs such as nevirapine, especially if there is on-going replication in the setting of VF, may lead to cross-resistance to etravirine. In participants with incident VF (VF detected after the baseline study visit), only 13% had the predicted resistance to etravirine at the time that failure was detected. This is in contrast to the 64% with predicted resistance by the end of the study, when participants also developed criteria for IF. High prevalence of resistance to etravirine among patients failing first-line HAART has previously been reported in India [36]. Therefore, when routine VL monitoring is not available and those with IF have likely been failing for a substantial period, they are expected to have accumulated enough NNRTI mutations to make etravirine a nonviable option. Although etravirine is not being considered by WHO or NACO as part of a second-line regimen, these findings confirm that in a setting where first-line regimens are almost exclusively NNRTI-based regimens and second-line regimens are chosen empirically, etravirine would not be an effective agent.
Our findings regarding the sensitivity of WHO's IF criteria to detect VF are in line with similar studies conducted in Africa, where subtype C virus also predominates. Sensitivity of WHO-defined criteria for IF to detect VF ranged from 21 to 58% [10–14], depending on the setting in which the study was conducted and the definition used for VF. We believe that the sensitivity we report in this study is on the lower end of the reported range because of its longitudinal nature. Most of the other studies were conducted cross-sectionally, allowing for the possibility that VF had been occurring for a longer duration and hence participants have enough time for IF to develop as well. In our study, we had a mixture of participants, some of whom may have been failing for some time at enrolment, and an additional 15 patients who developed VF during the study period. Since we know that IF develops after VF, we would not expect these 15 participants to have had a chance to develop IF at the time they were initially determined to have treatment failure.
The prevalence of VF was fairly low in our setting: approximately 10% of participants were found to have VF. This contributed to the low PPV of 34%. However, this PPV is only slightly lower than what others have reported: 36.8% in South Africa [10], 39% in a large study in Nigeria [13] and 70% in a government programme in India [12]. If IF criteria were used to define treatment failure, a substantial number of patients would be unnecessarily switched to more expensive second-line therapy. This not only adds cost to the national programme but also carries a risk of developing resistance to second-line agents for those who do not have access to third-line agents. Fortunately, since 2010, WHO has recommended targeted VL testing for those suspected of having IF to avoid switching patients with suppressed VLs to second-line therapy. NACO also endorsed these guidelines in 2011.
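The dependence of PPV on prevalence noted above follows directly from Bayes' rule. The sketch below is an illustration only: the function name is hypothetical, and the calculation assumes that sensitivity and specificity stay fixed as prevalence changes, which may not hold across different settings or definitions of VF.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence                # share with IF and VF
    false_pos = (1 - specificity) * (1 - prevalence)   # share with IF but no VF
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Sensitivity and specificity from this cohort, expressed as raw counts.
    sens, spec = 13 / 57, 440 / 465
    # Observed prevalence of VF (57/522) followed by hypothetical higher values.
    for prev in (57 / 522, 0.20, 0.30, 0.50):
        print(f"VF prevalence {100 * prev:.1f}% -> PPV {100 * ppv(sens, spec, prev):.1f}%")
```

At the observed prevalence of VF the function returns the PPV reported in this study; under the fixed sensitivity and specificity assumption, PPV rises as the hypothetical prevalence increases.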
There are several limitations to this study. First, patients were not HAART naïve at the time of enrolment, and we did not have baseline genotypes to determine the contribution of transmitted drug resistance to, or the impact of previous treatment regimens on, virologic treatment failure. However, studies from other parts of India show that the prevalence of drug resistance in HAART-naïve patients is between 3% and 5% in south India [37–39]. Second, because patients were enrolled into the study at any point one or more months after initiating HAART and the median time on treatment at study enrolment was 17 months, some participants had VF that may have been on-going for many months by the time the first VL was obtained. Because of this limitation, we were not able to calculate the rate of accumulation of mutations, since it would not be the same for those who had been failing for some time and those who developed VF during the study. Finally, we did not look for PI resistance when we performed genotyping because of the cost associated with performing PI resistance analysis. However, given that exposure to PIs is extremely low, and none of our patients were on PI-based first-line regimens, we do not think that this is a significant drawback of the study.

Conclusions
In resource-limited settings, HIV drug resistance is a threat to the long-term effectiveness of antiretroviral treatment. It is unlikely that resistance testing will be available at the level of the individual patient in the near future. Additionally, treatment changes based on individual patient testing are not feasible due to limited access to second- and third-line agents. The best way to ensure treatment success with second-line agents may be through early detection of treatment failure. WHO IF criteria have low sensitivity for detecting VF, and the presence of IF poorly predicts VF. Relying on CD4 count data alone would lead to a substantial number of unnecessary switches to second-line HAART. Although there are some conflicting studies [40], routine VL monitoring has been shown to be cost saving in resource-limited settings, especially when unnecessary treatment switches are taken into account [41]. A notable proportion of patients would be continued on first-line therapy that they are already failing, allowing for accumulation of drug resistance and, possibly, subsequent transmission of resistance to new partners [42,43]. Universal access to VL monitoring has the potential to have enormous public health impact because it would help avoid costly switches to second-line HAART, limit the development of drug resistance and transmission, and preserve future therapeutic options.