Corresponding Author: Olivia Keiser, Institute of Social and Preventive Medicine (ISPM), University of Bern, Finkenhubelweg 11, CH-3012 Bern, Switzerland. Tel.: +41 31 631 35 28; Fax: +41 31 631 35 20; E-mail: firstname.lastname@example.org
Objectives To examine the accuracy of the World Health Organization immunological criteria for virological failure of antiretroviral treatment.
Methods Analysis of 10 treatment programmes in Africa and South America that monitor both CD4 cell counts and HIV-1 viral load. Adult patients with at least two CD4 counts and viral load measurements between month 6 and 18 after starting a non-nucleoside reverse transcriptase inhibitor-based regimen were included. WHO immunological criteria include CD4 counts persistently <100 cells/μl, a fall below the baseline CD4 count, or a fall of >50% from the peak value. Virological failure was defined as two measurements ≥10 000 copies/ml (higher threshold) or ≥500 copies/ml (lower threshold). Measures of accuracy with exact binomial 95% confidence intervals (CI) were calculated.
Results A total of 2009 patients were included. During 1856 person-years of follow-up, 63 patients met the immunological criteria, and 35 patients (higher threshold) and 95 patients (lower threshold) met the virological criteria. Sensitivity (95% CI) was 17.1% (6.6–33.6%) for the higher and 12.6% (6.7–21.0%) for the lower threshold. Corresponding results for specificity were 97.1% (96.3–97.8%) and 97.3% (96.5–98.0%), for positive predictive value 9.5% (3.6–19.6%) and 19.0% (10.2–30.9%), and for negative predictive value 98.5% (97.9–99.0%) and 95.7% (94.7–96.6%).
Conclusions The positive predictive value of the WHO immunological criteria for virological failure of antiretroviral treatment in resource-limited settings is poor, but the negative predictive value is high. Immunological criteria are more appropriate for ruling out than for ruling in virological failure in resource-limited settings.
In high-income countries the diagnosis of treatment failure and the decision to switch therapy are largely based on plasma viral load monitoring and resistance testing (Hammer et al. 2008). In resource-limited settings, most ART programmes do not have access to viral load testing and rely instead on CD4 cell counts and clinical criteria. The World Health Organization (WHO) therefore developed immunological and clinical criteria for treatment failure to guide decisions on when to switch to second-line regimens (World Health Organization 2006). We analysed data from ART programmes in resource-limited settings that monitor both CD4 cell counts and viral load to examine sensitivity, specificity and positive and negative predictive values of the WHO immunological criteria for virological failure of ART.
The ART-LINC collaboration of IeDEA
The ART in Lower Income Countries collaboration of the International epidemiological Databases to Evaluate AIDS (ART-LINC of IeDEA) is a collaborative network of 17 ART programmes in Africa, Latin America and Asia, which has been described in detail elsewhere (Dabis et al. 2005; Keiser et al. 2008b). Briefly, programmes from resource-constrained settings that systematically collect data on patient characteristics and treatment outcomes were eligible for participation in ART-LINC. For the present study, we included all 10 programmes that routinely monitor viral load as well as CD4 counts. Routine viral load monitoring was defined as at least one viral load measurement between 3 and 9 months after starting ART in at least 50% of patients treated at that site. The sites were located in Senegal (Dakar), Uganda (Kampala), South Africa (Cape Town: Gugulethu and Khayelitsha; Johannesburg and Soweto), Morocco (Casablanca), Argentina (Buenos Aires) and Brazil (Rio de Janeiro and Porto Alegre). In all sites Institutional Review Boards approved participation in ART-LINC.
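The site eligibility rule above (at least one viral load measurement between 3 and 9 months after starting ART in at least 50% of patients) can be sketched as a small check. This is an illustration only: the function name, the input layout (one list of measurement times in months per patient) and the inclusive treatment of the window boundaries are assumptions, not details given in the study.

```python
from typing import Sequence


def has_routine_viral_load_monitoring(
    vl_months_by_patient: Sequence[Sequence[float]],
    window: tuple = (3.0, 9.0),   # months after ART start; inclusive bounds are an assumption
    min_share: float = 0.5,       # at least 50% of patients at the site
) -> bool:
    """Return True if enough patients have a viral load measurement inside the window."""
    if not vl_months_by_patient:
        return False
    start, end = window
    # Count patients with at least one measurement inside the window.
    covered = sum(
        any(start <= month <= end for month in months)
        for months in vl_months_by_patient
    )
    return covered / len(vl_months_by_patient) >= min_share


# Hypothetical site with four patients: two have a measurement in months 3 to 9.
site = [[4.0], [12.0], [], [6.0, 10.0]]
print(has_routine_viral_load_monitoring(site))
```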
Inclusion criteria and definitions
Since WHO recommends switching to a second-line regimen only after at least 6 months of first-line ART (World Health Organization 2006), we included all ART-naïve patients aged 16 years and older who started ART with a non-nucleoside reverse transcriptase inhibitor (NNRTI)-based regimen and had two or more CD4 cell counts and viral load measurements between month 6 and 18 after starting ART. For the purposes of this study, the WHO immunological criteria for treatment failure used were a decline in the CD4 cell count to the baseline value or below, a decline of at least 50% from the highest count on treatment or a persistent CD4 cell count below 100 cells/μl after 6 months of ART (World Health Organization 2006). Virological failure was defined as a viral load of ≥10 000 copies/ml (higher threshold) or as a viral load of ≥500 copies/ml (lower threshold).
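A minimal sketch of how these definitions could be applied to one patient's measurements is given below. It is not the authors' code. The function names, the use of a running peak, and the `required` parameter (2 for confirmed failure, 1 for a single value) are assumptions made for illustration; "persistent" CD4 below 100 cells/μl is approximated by requiring both counts to meet the criteria.

```python
from typing import Sequence


def cd4_meets_who_criteria(cd4: float, baseline: float, peak: float) -> bool:
    """True if a single CD4 count (cells/μl) satisfies any of the three criteria."""
    at_or_below_baseline = cd4 <= baseline
    halved_from_peak = cd4 <= 0.5 * peak   # decline of at least 50% from the peak
    below_100 = cd4 < 100                  # persistence is handled via `required`
    return at_or_below_baseline or halved_from_peak or below_100


def immunological_failure(window_cd4: Sequence[float], baseline: float,
                          prior_peak: float, required: int = 2) -> bool:
    """Apply the criteria to the first two CD4 counts between month 6 and 18."""
    hits, peak = 0, prior_peak
    for cd4 in window_cd4[:2]:
        peak = max(peak, cd4)              # highest count on treatment so far
        hits += cd4_meets_who_criteria(cd4, baseline, peak)
    return hits >= required


def virological_failure(window_vl: Sequence[float], threshold: float = 10_000,
                        required: int = 2) -> bool:
    """Apply a viral load threshold (copies/ml) to the first two measurements."""
    return sum(vl >= threshold for vl in window_vl[:2]) >= required


# Hypothetical patient: baseline 101 cells/μl, peak of 300 cells/μl before month 6.
print(immunological_failure([140, 120], baseline=101, prior_peak=300))
print(virological_failure([12_000, 600], threshold=500))
```

Setting `required=1` reproduces the less strict variant in which a single value meeting the criteria is enough.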
We calculated sensitivity, specificity and positive and negative predictive values with binomial exact confidence intervals for the higher and lower viral load thresholds. The first two measurements in the period between month 6 and 18 after starting ART were considered. In a first analysis, we required both measurements to meet the immunological and virological criteria, because in practice many patients switch therapy only after failure has been confirmed by a second CD4 cell count or viral load measurement. The date of the second measurement was taken as the date of meeting criteria. In a further analysis only one value meeting the criteria was required. All analyses were performed in Stata version 10.1 (Stata Corporation, College Station, TX, USA).
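For readers unfamiliar with exact binomial confidence intervals, the sketch below computes a Clopper-Pearson interval by bisection using only the Python standard library. It is an illustration, not the Stata routine used in the analysis, and the function names are assumptions. The example input of 6 events in 35 patients is back-calculated from the reported sensitivity of 17.1% among 35 patients with virological failure; it is not a count stated in the text.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def solve_increasing(f, target: float, iters: int = 50) -> float:
    """Find p in (0, 1) with f(p) = target, for f increasing in p (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, level: float = 0.95) -> tuple:
    """Clopper-Pearson interval for x events in n trials."""
    alpha = 1 - level
    # Lower limit: P(X >= x | p) = alpha / 2
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P(X <= x | p) = alpha / 2
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


low, high = exact_ci(6, 35)
print(f"{100 * 6 / 35:.1f}% ({100 * low:.1f}-{100 * high:.1f}%)")
```

The printed interval should lie close to the reported 6.6–33.6%.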
Figure 1 shows that of 11 044 treatment-naïve patients aged 16 years or older, 2009 (18.2%) met the inclusion criteria. The proportion of patients excluded because of an insufficient number of viral load measurements was similar among those who did not meet immunological criteria for failure and those who did: 397 of 2343 patients (16.9%) compared with 12 of 75 patients (16.0%).
The number of patients analysed at each site ranged from 37 to 660. Table 1 shows the patient characteristics at the start of ART. The majority of patients were women from Africa. The median age was 34 years and the median year of starting ART was 2004. Treatment was started at a median CD4 cell count of 101 cells/μl and a median viral load of 5.0 log copies/ml. During a total of 1856 person-years of follow-up from month 6 to 18 after starting ART, 4759 CD4 counts and 4618 viral load measurements were recorded. Sixty-three patients met the WHO immunological criteria, and 35 patients (higher threshold) and 95 patients (lower threshold) met the virological criteria for failure.
Table 1. Baseline characteristics of the 2009 patients included in analyses
Table 2, which summarizes the accuracy of the WHO immunological criteria for virological failure, shows that sensitivity was low, ranging from 12.6% to 48.1% depending on the definition chosen for virological failure (higher or lower threshold) and on whether two measurements or only one were required to meet the criteria for failure. Specificity was higher, ranging from 86.8% to 97.3%. The positive predictive value was very poor (9.5–28.7%), whereas negative predictive values were high (88.4–98.5%).
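The accuracy measures for the analysis requiring two measurements can be reproduced from the reported totals with a simple 2×2 table, as in the sketch below. The true positive counts (6 for the higher and 12 for the lower threshold) are not stated in the text. They are back-calculated here from the reported sensitivities and positive predictive values and should be read as an assumption. The class and method names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # immunological failure and virological failure
    fn: int  # no immunological failure, but virological failure
    fp: int  # immunological failure, no virological failure
    tn: int  # neither immunological nor virological failure

    @classmethod
    def from_margins(cls, total: int, index_pos: int, reference_pos: int, tp: int):
        """Build the table from the marginal totals and the true positive count."""
        fn = reference_pos - tp
        fp = index_pos - tp
        tn = total - tp - fn - fp
        return cls(tp, fn, fp, tn)

    def metrics(self) -> dict:
        """Accuracy measures in percent."""
        return {
            "sensitivity": 100 * self.tp / (self.tp + self.fn),
            "specificity": 100 * self.tn / (self.tn + self.fp),
            "ppv": 100 * self.tp / (self.tp + self.fp),
            "npv": 100 * self.tn / (self.tn + self.fn),
        }


# 2009 patients, 63 meeting immunological criteria; tp values are back-calculated.
higher = TwoByTwo.from_margins(total=2009, index_pos=63, reference_pos=35, tp=6)
lower = TwoByTwo.from_margins(total=2009, index_pos=63, reference_pos=95, tp=12)

for label, table in ((">=10 000 copies/ml", higher), (">=500 copies/ml", lower)):
    rounded = {name: round(value, 1) for name, value in table.metrics().items()}
    print(label, table, rounded)
```

With these counts the rounded percentages agree with the values reported for the two-measurement analysis, which supports the back-calculation without proving it.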
Table 2. Sensitivity, specificity, positive and negative predictive values of World Health Organization (WHO) immunological criteria for virological failure of antiretroviral therapy
Columns: Definition of virological failure; No. of CD4 count and viral load measurements meeting criteria*; Sensitivity, % (95% CI); Specificity, % (95% CI); Positive predictive value, % (95% CI); Negative predictive value, % (95% CI)
Rows:
≥10 000 copies/ml: Based on two measurements meeting criteria
≥10 000 copies/ml: Based on one measurement meeting criteria
≥500 copies/ml: Based on two measurements meeting criteria
≥500 copies/ml: Based on one measurement meeting criteria
Percentages with binomial exact confidence intervals are shown. TP, true positive; FN, false negative; FP, false positive; TN, true negative.
*Criteria for virological failure as shown in table, criteria for immunological failure as defined by WHO: decline in the CD4 cell count to the baseline value or below, a decline of at least 50% from the highest count on treatment or a persistent CD4 cell count below 100 cells/μl after six months of ART (World Health Organization 2006).
Viral load monitoring is the gold standard used in high-income countries to diagnose failure of ART, but it is not generally available in resource-limited settings. CD4 cell counts and clinical outcomes are used to monitor treatment in the absence of viral load. Our results show that the positive predictive value of the immunological (CD4 cell count) criteria for failure defined by WHO is poor for virological failure: depending on the definitions chosen, only about 10–30% of patients who met immunological criteria had virological failure. The negative predictive value was, however, considerably higher. For example, 95.7% of patients with CD4 counts not meeting the immunological criteria for failure had viral loads <500 copies/ml.
Our study has a number of strengths. Although most patients treated at sites participating in the ART-LINC of IeDEA network do not have access to routine viral load monitoring, a minority of patients did, and they made this study possible. More than 2000 patients from six countries in Africa and South America could be included in the analysis. A limitation of our study is that we could not consider clinical outcomes. Some sites do not systematically collect data on clinical events and in sites that record clinical events the diagnostic capacities and definitions vary. It was therefore not possible to examine the correlation between clinical criteria and the laboratory criteria considered in our study. Notably, clinical failures were rare in a treatment programme in three African countries that included monitoring of CD4 cell counts and viral load (Palombi et al. 2009). It is therefore unlikely that the inclusion of clinical failure would have substantially changed our results. Finally, we stress that the sites included in our study may not be typical for all sites providing ART in these countries: they represent a sample of programmes with electronic medical record systems (Forster et al. 2008) and access to viral load monitoring. Their rate of immunological failure was, however, similar to that observed in ART-LINC sites without access to viral load monitoring (data not shown).
Several previous studies have shown poor concordance between CD4 and viral response. A study from the USA showed that a lack of increase in CD4 cell counts at one year had a sensitivity of 35% and a specificity of 94% in predicting viral load suppression (Moore et al. 2006). A study of South African gold miners found that WHO clinical and CD4 criteria had both poor sensitivity and poor specificity in detecting virological failure (Mee et al. 2008). Similarly, a study from Botswana found that an increase in CD4 cell count after initiating ART was only moderately accurate in identifying patients with undetectable viral load (Bisson et al. 2006). Recently, Badri et al. (2008) used data from the Cape Town AIDS cohort, where viral load was measured every 3 months, to show that CD4 count changes correlated with viral load at cohort level, but had limited utility in identifying virological failure in individual patients. An alternative method to identify treatment failure might be the assessment of adherence. A study of a private health care management programme in nine countries in Southern Africa showed that monitoring adherence may be more effective in predicting virological failure than declining CD4 cell counts (Bisson et al. 2008). Unfortunately, we did not have information on adherence in our study.
We focussed on the predictive values of immunological criteria, which will be most relevant to clinicians interpreting CD4 cell counts. Our study illustrates that the power of a test to rule a diagnosis in or out depends not only on its specificity or sensitivity, as suggested by the SpPIn (high specificity, positive, rules in) and SnNOut (high sensitivity, negative, rules out) rules promoted by the proponents of evidence-based medicine (Pewsner et al. 2004). The power of the WHO immunological criteria to rule virological failure in was reduced dramatically by their low sensitivity, despite their high specificity. Similarly, the power to rule out depends on both sensitivity and specificity. In our study, the negative predictive values were high not because the immunological criteria were a powerful test, but because only a few patients developed virological failure: predictive values will also depend on the incidence of virological failure in different programmes. Remarkably, we found in a previous study (Keiser et al. 2008a) that the probability of viral rebound was very similar in South African townships and Switzerland, indicating that the rate of virological failure may not vary substantially across settings.
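The dependence of predictive values on the frequency of virological failure follows from Bayes' theorem and can be illustrated with a short sketch. The function name is an assumption. The inputs are the reported sensitivity and specificity for the higher threshold and the observed proportion of patients with virological failure (35 of 2009). The 20% scenario is hypothetical and is included only to show the direction of the effect.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple:
    """Return (PPV, NPV) for given test accuracy and prevalence, all as proportions."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


observed = predictive_values(0.171, 0.971, 35 / 2009)   # reported accuracy, observed frequency
hypothetical = predictive_values(0.171, 0.971, 0.20)    # assumed higher failure rate

print(f"observed:     PPV {observed[0]:.3f}, NPV {observed[1]:.3f}")
print(f"hypothetical: PPV {hypothetical[0]:.3f}, NPV {hypothetical[1]:.3f}")
```

With unchanged sensitivity and specificity, a higher frequency of virological failure raises the positive predictive value and lowers the negative predictive value.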
A limitation of our study is that it was not designed to evaluate diagnostic criteria but analysed routine clinical data, which may have introduced bias. For example, if the reference test (i.e. viral load measurement) is not applied consistently to confirm negative results of the index test (i.e. CD4 counts not indicating immunological failure), partial verification or work-up bias may be introduced (Whiting et al. 2004). In our study this may have led to overestimation or underestimation of predictive values (Pewsner et al. 2004). However, substantial work-up bias is unlikely: the proportion of patients excluded because of an insufficient number of viral load measurements was similar among those who did not meet immunological criteria for failure and those who did. Missing viral load data may nevertheless have affected our estimates of predictive values, but conclusions would not have changed: assuming that all patients with missing viral load measurements had virological failure would increase the positive predictive value to a maximum of 32.0% and reduce the negative predictive value to 79.5%, based on the lower threshold for virological failure and two measurements meeting criteria.
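The worst-case figures quoted above can be checked with simple arithmetic, sketched below. The counts of true positives (12) and true negatives (1863) for the lower threshold with two measurements are back-calculated from the reported percentages and are an assumption. The numbers of patients with missing viral load measurements (12 of 75 and 397 of 2343) are taken from the text. Variable names are illustrative.

```python
# Patients included in the analysis (lower threshold, two measurements).
true_positives = 12      # back-calculated: immunological and virological failure
true_negatives = 1863    # back-calculated: neither type of failure

# Patients excluded for an insufficient number of viral load measurements.
missing_with_immunological_failure = 12      # 12 of 75
missing_without_immunological_failure = 397  # 397 of 2343

# Worst case: every excluded patient is assumed to have had virological failure.
all_with_immunological_failure = 75
all_without_immunological_failure = 2343

worst_case_ppv = (true_positives + missing_with_immunological_failure) / all_with_immunological_failure
worst_case_npv = true_negatives / all_without_immunological_failure

print(f"PPV {100 * worst_case_ppv:.1f}%, NPV {100 * worst_case_npv:.1f}%")
```

Under these assumptions the result matches the 32.0% and 79.5% given in the text.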
There is debate on the feasibility and cost-effectiveness of viral load monitoring in the context of scaling up ART in resource-limited settings (Calmy et al. 2007; Phillips et al. 2008; Walensky et al. 2008). WHO stipulates that viral load monitoring is desirable, but not essential, for a public health approach to ART (Gilks et al. 2006). We recently analysed rates of switching from non-nucleoside reverse transcriptase inhibitor-based first-line regimens to protease inhibitor-based regimens in Africa, South America and Asia (Keiser et al. in press). We found that patients tended to switch earlier and at higher CD4 cell counts in programmes with access to viral load monitoring than in programmes without it. Clearly, further work is required to investigate whether other variables exist that could improve prediction of treatment failure in programmes without access to viral load monitoring, whether simplified techniques to measure viral load can be implemented, and at what intervals patients should optimally be monitored. Finally, future studies should examine the long-term clinical progression and mortality of patients meeting and not meeting the clinical, immunological and virological WHO criteria for treatment failure, and of patients switching and not switching to second-line regimens.
The ART-LINC collaboration of the International epidemiological Databases to Evaluate AIDS (IeDEA) is funded by the US National Institutes of Health (Office of AIDS Research and National Institute of Allergy and Infectious Diseases) and the French Agence Nationale de Recherches sur le Sida et les Hépatites Virales (ANRS). We are grateful to Hannock Tweya, Paula Braitstein, Martin Brinkhof, Suely Tuboi, Mar Pujades-Rodriguez, Alexandra Calmy, Nagalingeswaran Kumarasamy, Denis Nash, Andreas Jahn, Ruedi Lüthy and Mina Hosseinipour for helpful comments.
The ART-LINC of IeDEA Central Coordinating Team: Eric Balestre, Martin Brinkhof, François Dabis (principal investigator), Matthias Egger (principal investigator), Claire Graber, Beatrice Fatzer, Olivia Keiser, Charlotte Lewden, Mar Pujades, Mauro Schechter (principal investigator).
Collaborating centres: ANRS 1290 (Dakar, Senegal); Adherence Monitoring Uganda (AMU) cohort; Gugulethu ART Programme, (Cape Town, South Africa); Khayelitsha ART Programme, (Cape Town, South Africa); Themba Lethu/WITS (Johannesburg, South Africa); Perinatal HIV Research Unit (Soweto, South Africa); Morocco Antiretroviral Treatment Cohort, Centre Hospitalier Universitaire (Casablanca, Morocco); Prospective Evaluation in the Use and Monitoring of Antiretrovirals in Argentina (PUMA), Buenos Aires, Argentina; South Brazil HIV Cohort (SOBRHIV), Hospital de Clinicas (Porto Alegre, Brazil); Rio de Janeiro HIV Cohort, Hospital Universitario Clementino Fraga Filho (Rio de Janeiro, Brazil).