HIV‐1 transmitted drug resistance surveillance: shifting trends in study design and prevalence estimates

Abstract
Introduction HIV‐1 transmitted drug resistance (TDR) prevalence increased during the initial years of the antiretroviral therapy (ART) global scale‐up. Few studies have examined recent trends in TDR prevalence using published genetic sequences and described the characteristics of ART‐naïve persons from whom these published sequences have been obtained.
Methods We identified 125 studies published between 2014 and 2019 for which HIV‐1 reverse transcriptase (RT) sequences, with or without protease, from ≥50 ART‐naïve adult persons were submitted to the GenBank sequence database. The population characteristics and TDR prevalence were compared to those in 122 studies published in the preceding five years between 2009 and 2013. TDR prevalence was analysed using median study‐level and person‐level data.
Results and discussion The 2009 to 2013 and 2014 to 2019 studies reported sequence data from 32,866 and 41,724 ART‐naïve persons respectively. Studies from the low‐ and middle‐income country (LMIC) regions in sub‐Saharan Africa, South/Southeast Asia and Latin America/Caribbean accounted for approximately two‐thirds of the studies during each period. Between the two periods, the proportion of studies from sub‐Saharan Africa and from South/Southeast Asia countries other than China decreased from 43% to 32% and the proportion of studies performed at sentinel sites for recent HIV‐1 infection decreased from 33% to 22%. Between 2014 and 2019, median study‐level TDR prevalence was 4.1% in South/Southeast Asia, 6.0% in sub‐Saharan Africa, 9.1% in Latin America/Caribbean, 8.5% in Europe and 14.2% in North America. In the person‐level analysis, there was an increase in overall, NNRTI and two‐class NRTI/NNRTI resistance in sub‐Saharan Africa; an increase in NNRTI resistance in Latin America/Caribbean; and an increase in overall, NNRTI and PI resistance in North America.
Conclusions Overall, NNRTI and dual NRTI/NNRTI‐associated TDR prevalence was significantly higher in sub‐Saharan Africa studies published between 2014 and 2019 compared with those published between 2009 and 2013. The decreasing proportion of studies from the hardest hit LMIC regions and the shift away from sentinel sites for recent infection suggest that global TDR surveillance efforts and publication of findings require renewed emphasis.


| INTRODUCTION
HIV-1 drug resistance (HIVDR) testing is not routinely available for clinical management in most low- and middle-income countries (LMICs), which shoulder the largest global burden of HIV-1 infection. The choice of initial antiretroviral therapy (ART) in LMICs is thus informed by population-level HIVDR prevalence estimates derived from persons initiating therapy. WHO initially recommended methods to classify the prevalence of transmitted drug resistance (TDR) in persons likely to have been recently infected with HIV. Such surveys were designed to maximize inclusion of viruses with drug resistance mutations (DRMs) that had been transmitted before they were outcompeted by more fit wild-type revertants [1,2]. Such surveys, however, were not nationally representative, and in low-incidence settings, restrictive participant inclusion criteria made it challenging to complete enrolment within a reasonable time period [3].
In 2015, the WHO modified its HIV drug resistance surveillance strategy to focus primarily on populations initiating first-line ART; resistance detected in these surveys was called pretreatment drug resistance (PDR) [4]. In contrast to studies in which all persons were antiretroviral (ARV) drug-naïve, PDR surveys allowed for the inclusion of persons initiating (or reinitiating) first-line ART who may have previously received ARV drugs, such as those who received ARVs for the prevention of mother-to-child transmission or in whom prior ART was interrupted. This surveillance strategy assesses the overall burden of HIVDR in the population initiating and reinitiating ART and best supports ART guideline optimization within the public health approach in most LMICs.
WHO spearheaded two systematic reviews of HIVDR in previously ART-naïve persons from LMICs. The first, published in 2012, reported an increase in TDR in southern and eastern Africa between 2001 and 2011 [5]. The second review, published in 2018, examined PDR studies through 2016 and reported that in multiple regions in sub-Saharan Africa and in the Latin America/Caribbean region, non-nucleoside reverse transcriptase inhibitor (NNRTI) PDR levels had approached or exceeded 10% [6]. Similarly high levels of NNRTI-associated TDR and/or PDR have been reported in systematic reviews from South Africa [7] and the Latin America/Caribbean region [8,9].
In 2015, we published a systematic review and meta-analysis of global TDR trends using data from 287 studies published between 2000 and 2013. Our 2015 meta-analysis differed from previous reviews and meta-analyses because it included only studies for which HIV-1 sequences had been submitted to GenBank [10]. The availability of all sequences made it possible to perform individual person-level analyses and to analyse trends in the prevalence of each drug resistance mutation. The study reported a yearly 1.09-fold increase in the odds of TDR since global ART scale-up programmes began and reported that NNRTI resistance increased in five regions including sub-Saharan Africa, Latin America/Caribbean, North America, Europe, as well as in upper-income Asian countries.
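To make the "yearly 1.09-fold increase in the odds of TDR" concrete, the following is a minimal sketch of how a constant yearly odds ratio compounds into a prevalence change. The baseline prevalence and the ten-year horizon are hypothetical values chosen for illustration, and a constant odds ratio across years is a simplifying assumption.

```python
def prevalence_after(baseline_prevalence, yearly_odds_ratio, years):
    """Project prevalence forward under a constant yearly odds ratio.

    Illustrative only: assumes the same odds ratio applies every year.
    """
    odds = baseline_prevalence / (1.0 - baseline_prevalence)  # prevalence -> odds
    odds *= yearly_odds_ratio ** years                        # compound yearly
    return odds / (1.0 + odds)                                # odds -> prevalence


# Hypothetical baseline of 5% TDR, projected over 10 years at OR = 1.09 per year.
projected = prevalence_after(0.05, 1.09, 10)
print(f"Projected prevalence: {projected:.1%}")
```

Because the compounding acts on the odds rather than on the prevalence itself, the projected prevalence rises somewhat less than the 1.09^10 multiplier would suggest if applied to the prevalence directly.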
In this study, we describe TDR prevalence data collected in the six years since the completion of our previous meta-analysis (2014-2019). We compare the population characteristics and prevalence of TDR in recent studies to those studies published in the preceding five-year period between 2009 and 2013 meeting the same inclusion criteria. Like our previous study, the current analysis includes only those studies for which sequences are publicly available.

| Study inclusion criteria
All published HIV-1 group M pol nucleic acid sequences containing the reverse transcriptase (RT) gene of HIV-1 with or without the protease gene submitted between 1 January 2014 and 15 December 2019 were retrieved using a BLAST search of the GenBank viral sequence database V. 235 (released 15 December 2019) (Table S1). Sequences with the same GenBank "Author" and "Title" fields were grouped into submission sets. The GenBank annotation and associated published papers were reviewed to identify submission sets (studies) describing ≥50 ART-naïve infected individuals containing sequences encompassing RT codons 40 to 240. We compared the data from these studies with the data from those published between 2009 and 2013 meeting the same criteria.
Studies of individuals initiating first-line ART were excluded if they also included individuals with any previous ART exposure. Studies that included children born to mothers receiving ARTs to prevent mother-to-child transmission were excluded as were studies of populations whose viruses were sequenced based on knowledge of their HIV drug resistance status (i.e. pretherapy sequences from persons who subsequently developed virological failure and/or HIVDR). Next-generation sequencing (NGS) studies were excluded if the threshold for mutation detection was <15% or not reported.
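The inclusion and exclusion rules above can be summarized as a simple filter. The sketch below is illustrative rather than the authors' actual screening procedure (which involved manual review of GenBank annotations and papers); the `SubmissionSet` class and all of its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubmissionSet:
    """Hypothetical summary record for one GenBank submission set (study)."""
    n_art_naive: int                      # ART-naive individuals with sequences
    rt_first_codon: int                   # first RT codon covered
    rt_last_codon: int                    # last RT codon covered
    any_prior_art_exposure: bool = False  # includes persons with previous ART
    includes_pmtct_children: bool = False # children of mothers given ARVs for PMTCT
    selected_on_resistance: bool = False  # sequenced because of known HIVDR status
    is_ngs: bool = False                  # next-generation sequencing study
    ngs_threshold_pct: Optional[float] = None  # mutation detection threshold


def meets_inclusion(s: SubmissionSet) -> bool:
    """Apply the stated criteria: >=50 persons, RT codons 40-240, no exclusions."""
    if s.n_art_naive < 50:
        return False
    if s.rt_first_codon > 40 or s.rt_last_codon < 240:
        return False
    if s.any_prior_art_exposure or s.includes_pmtct_children or s.selected_on_resistance:
        return False
    # NGS studies are excluded if the threshold is <15% or not reported.
    if s.is_ngs and (s.ngs_threshold_pct is None or s.ngs_threshold_pct < 15):
        return False
    return True
```

For example, an NGS study reporting a 20% detection threshold would pass this filter, whereas one with a 1% threshold or an unreported threshold would not.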

| Data collection
For each study, the following information was collected: (i) Country and year of sampling; (ii) Location at which samples for genotypic resistance testing were obtained including HIV clinics, voluntary counselling and testing centres, antenatal clinics, blood transfusion centres, sexually transmitted diseases clinics and clinics for persons who inject drugs. For some studies, however, information was provided only on the location at which samples underwent genotypic resistance testing (e.g. a reference or public health laboratory). Sentinel sites for HIV-1 surveillance were defined as sites other than HIV clinics at which genotypic resistance testing was performed; (iii) Stated purpose of the study, such as whether it was performed to estimate TDR prevalence, assess sequence diversity or characterize a transmission network; (iv) Whether the population predominantly comprised recently infected persons; (v) Specimens submitted for sequencing such as plasma, peripheral blood mononuclear cells (PBMCs) and dried blood spots; (vi) Sequencing method used: Sanger sequencing versus NGS. For NGS, the mutation detection threshold was recorded.
Studies meeting inclusion criteria were assigned to one of the following geo-economic regions: (i) sub-Saharan Africa; (ii) LMICs of South and Southeast Asia; (iii) Latin America and Caribbean; (iv) Europe; (v) United States, Canada and Puerto Rico (North America); (vi) Upper-income Asian countries; (vii) Countries of the former Soviet Union; (viii) North Africa (Middle East) and (ix) Australia. For studies conducted in countries in different regions, separate datasets for each region were created, provided the study had more than 50 individuals per region. LMICs in sub-Saharan Africa, South/Southeast Asia and Latin America/Caribbean were defined according to WHO [4].

| Sequence analyses
TDR was defined as the presence of one or more mutations from the WHO 2009 list of surveillance drug resistance mutations (SDRMs) [11]. The SDRM list consists of 93 drug resistance mutations including 34 NRTI resistance mutations at 15 RT positions, 19 NNRTI resistance mutations at 10 RT positions and 40 PI resistance mutations at 18 protease positions.
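As a minimal sketch of this definition, the snippet below flags TDR and the drug classes involved from a person's list of mutations. The lookup table is a deliberately small illustrative subset containing only RT mutations named elsewhere in this paper, not the full 93-mutation SDRM list, and the function names are hypothetical. In the study itself this step was performed with the CPR tool.

```python
# Illustrative subset only: a real analysis needs the complete WHO 2009 SDRM list,
# including the PI mutations, none of which are named in this text.
SDRM_SUBSET = {
    "NRTI": {"K65R", "M184V", "T215Y", "T215F"},
    "NNRTI": {"K103N", "V106M", "Y181C", "Y188L"},
}


def tdr_classes(mutations):
    """Return the drug classes for which at least one listed SDRM is present."""
    observed = set(mutations)
    return {drug_class for drug_class, sdrms in SDRM_SUBSET.items() if sdrms & observed}


def has_tdr(mutations):
    """TDR is the presence of one or more SDRMs."""
    return bool(tdr_classes(mutations))


def has_two_class_nrti_nnrti(mutations):
    """Two-class resistance: at least one NRTI SDRM and one NNRTI SDRM."""
    return {"NRTI", "NNRTI"} <= tdr_classes(mutations)
```

A sequence carrying K103N and M184V would be counted once for overall TDR, once for each class, and once for two-class NRTI/NNRTI resistance. A sequence carrying only mutations absent from the list would not be counted.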
The Calibrated Population Resistance (CPR) analysis tool (https://hivdb.stanford.edu/cpr/) was used to calculate the proportions of individuals per study with overall, NRTI, NNRTI and PI-associated TDR [12]. HIV-1 subtype was determined using the COMET HIV-1 Subtyping Tool [13]. For each study, the epidemiologic characteristics, TDR prevalence and CPR analysis can be accessed at http://hivdb.stanford.edu/surveillance/map/.
Tenofovir resistance was defined as the presence of any one of the following four sets of mutations: (i) K65R, K70E and Y115F; (ii) the thymidine-analogue mutations (TAMs) T215Y/F, which in contrast to the remaining TAMs reduce response to tenofovir-containing regimens [14]; (iii) the multidrug resistance mutations T69S_SS and Q151M and (iv) several additional non-polymorphic DRMs not on the SDRM list including A62V, K65N, T69del and K70G/Q/N/S/T [15].

| TDR prevalence
TDR prevalence was examined using study-level analyses in which the median overall and ART-class TDR prevalences were represented by summary values from each study and individual-level analyses in which individuals from all studies were pooled. Study-level analyses compared TDR prevalence in studies submitted to GenBank between 2014 and 2019 and those submitted to GenBank between 2009 and 2013 that contained sequences from ≥50 ARV-naïve adults. The median prevalence of overall, NRTI, NNRTI and/or PI TDR was compared within those regions for which five or more studies were available in both periods using the Wilcoxon Rank Sum test.
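The study-level comparison can be sketched as follows: each study contributes one prevalence value, medians are computed per period, and the two sets of values are compared with a rank-sum test. This is a standard-library sketch that uses the normal approximation with a tie correction and no continuity correction, so its p-values may differ slightly from those of statistical packages. The prevalence values are invented for illustration.

```python
import math
from statistics import median


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (W, z, p) where W is the sum of ranks of the first sample.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


# Hypothetical per-study TDR prevalences (%) for one region in each period.
period_2009_2013 = [2.1, 3.4, 4.0, 2.8, 5.1, 3.9]
period_2014_2019 = [5.5, 6.0, 4.8, 7.2, 6.6, 9.0]
print(median(period_2009_2013), median(period_2014_2019))
print(rank_sum_test(period_2009_2013, period_2014_2019))
```

With only five to ten studies per group, an exact version of the test would be preferable to the normal approximation used here.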
Individual-level analyses were also performed to examine temporal trends in pooled virus sequences from each region obtained since 2009. This analysis excluded sequences obtained prior to 2009, which represented a substantial proportion of sequences described in the 2009 to 2013 studies and a small proportion of sequences described in the 2014 to 2019 studies. To examine temporal trends since 2009, the overall population of pooled sequences by sample year was bisected and TDR prevalence between the two time periods (2009 to 2011 and 2012 to 2018) was compared using the Fisher Exact test. In a complementary analysis, a generalized linear mixed logistic regression analysis was used to relate the sample year to TDR prevalence. To account for study heterogeneity, study was included in the model as a random effect using the R package lme4.
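The two-period comparison of pooled sequences reduces to a 2×2 table (TDR present or absent, by period). Below is a minimal standard-library sketch of a two-sided Fisher exact test on such a table. The counts are hypothetical, and the mixed-effects logistic regression fitted with lme4 is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same margins)
    that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    probabilities = [prob(k) for k in range(lowest, highest + 1)]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probabilities if p <= p_observed * (1 + 1e-9)))


# Hypothetical counts: persons with / without TDR in each period.
p_value = fisher_exact_two_sided(30, 970, 60, 940)
print(f"p = {p_value:.4f}")
```

Pooling individuals in this way ignores clustering by study, which is why the regression with study as a random effect serves as a complementary analysis.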

| Study population
The 2014 to 2019 GenBank search identified 125 studies meeting study inclusion criteria. Figure 1 displays a flow chart summarizing the process by which studies were reviewed to determine whether they met inclusion criteria. All but four studies were linked to a peer-reviewed publication. These studies contained RT sequences from 41,724 individuals; a subset of 116 studies contained protease sequences from 40,084 individuals. The list of 125 included studies is provided in Table S2. Between the two study periods, the proportion of studies in sub-Saharan Africa decreased from 29% to 22% (Fisher Exact test; p = 0.2), whereas the proportion in South/Southeast Asia increased from 24% to 35% (Fisher Exact test; p = 0.05). During the first study period, 41% of the 29 studies in South/Southeast Asia were performed in China. In the second study period, 71% of the 44 studies in South/Southeast Asia were performed in China. Therefore, the proportion of studies from South/Southeast Asia outside of China decreased from 14% (17 studies) to 10% (13 studies).
For the 2009 to 2013 studies, the most common primary stated study purposes were to assess TDR prevalence (79%), to characterize sequence diversity for molecular epidemiologic purposes or vaccine development (14%), or to study transmission dynamics using sequence networks (3%). For the 2014 to 2019 period, the proportion of studies designed for assessing TDR prevalence decreased to 45%, whereas studies to characterize sequence diversity for molecular epidemiologic purposes or vaccine development (32%) and to study transmission dynamics using sequence networks (16.1%) increased.
For the 2009 to 2013 studies, the most common recruitment sites were HIV clinics (55%), sentinel sites for HIV-1 surveillance including blood donation centres, antenatal clinics, voluntary counselling and testing sites and injection drug user clinics (33%) and regional reference or public health laboratories (10%). For the 2014 to 2019 studies, the most common recruitment sites were HIV clinics (41%), regional reference or public health laboratories (26%) and sentinel sites for HIV surveillance (22%). During both study periods, approximately 92% of specimens were plasma, whereas the remaining specimen types were PBMCs and dried blood spots. Between 2014 and 2019, GenBank contained consensus NGS sequences from four studies with 50 or more ARV-naïve persons. However, only one of these studies was included in this review because the consensus sequences used a mutation detection threshold that was below 1% in two studies and that was not reported in a third study. The consensus sequence from the fourth study used a 20% mutation-detection threshold. There was a large NGS study in the NCBI Sequence Read Archive that did not have a consensus sequence in GenBank and was therefore not included in this analysis [16].
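The arithmetic behind the non-China South/Southeast Asia figures can be checked directly from the numbers in this paragraph (29 and 44 regional studies, 41% and 71% from China, 122 and 125 total studies). The following sketch only restates that calculation, and the function name is illustrative.

```python
def studies_outside_china(regional_studies, percent_from_china):
    """Number of regional studies not performed in China, from a rounded percentage."""
    from_china = round(regional_studies * percent_from_china / 100)
    return regional_studies - from_china


# 2009 to 2013: 29 South/Southeast Asia studies, 41% from China, 122 studies overall.
outside_first = studies_outside_china(29, 41)
share_first = round(100 * outside_first / 122)

# 2014 to 2019: 44 South/Southeast Asia studies, 71% from China, 125 studies overall.
outside_second = studies_outside_china(44, 71)
share_second = round(100 * outside_second / 125)

print(outside_first, share_first)    # studies and % of all studies, first period
print(outside_second, share_second)  # studies and % of all studies, second period
```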

| TDR prevalence
In the 2014 to 2019 studies, the median study-level TDR prevalence was 6.0% in sub-Saharan Africa and 4.1% in South/Southeast Asia. By comparison, the median study-level TDR prevalence in Europe, the upper-income Asian countries, Latin America/Caribbean and North America ranged from 8.5% to 14.2%. Between the two study periods, there was a statistically significant study-level increase in overall TDR in sub-Saharan Africa and North America, in NNRTI TDR in sub-Saharan Africa, in PI TDR in South/Southeast Asia and North America, and in two-class NRTI/NNRTI resistance in sub-Saharan Africa (Table 2, Figure 3). There were no consistent regional or temporal differences in TDR prevalence estimates between different categories of recruitment site. There were also no consistent regional or temporal differences in TDR prevalence between studies performed specifically for estimating TDR prevalence and those performed for other purposes. Figure 3 shows six outlier studies in which overall TDR prevalence exceeded 20%. Two of these were published in the United States between 2009 and 2013 [17,18]. One study included samples from 91 chronically infected persons and the other included samples from 662 recently diagnosed persons. Four of the outlier studies were published between 2014 and 2019, including studies of 59 fishermen along Lake Victoria in Kenya, 141 persons prior to starting ART in Cuba, 118 newly diagnosed persons in Croatia and 289 hospitalized patients in Portugal [19-22]. The study from Croatia contained a large cluster of persons with viruses containing a single SDRM.
Table 3 bisects the pooled sequences into approximately equal numbers spanning the years 2009 to 2011 and 2012 to 2018 and compares the proportions of individuals with TDR during these time periods by region and ARV drug class. This analysis shows that the prevalence of overall TDR increased in sub-Saharan Africa and North America but decreased in Europe. The prevalence of NNRTI TDR increased in sub-Saharan Africa, North America and Latin America/Caribbean, whereas the prevalence of NRTI TDR decreased in Europe. The prevalence of two-class NRTI/NNRTI resistance increased in sub-Saharan Africa. The same significant trends were observed in the regression analysis relating sample year to TDR prevalence (Table 4).

| Distribution of DRMs
Eight NNRTI-associated mutations had a prevalence above 0.1%. The prevalence of three of these mutations (K103N, Y181C and Y188L) increased by more than 2-fold in sub-Saharan Africa (Table 5). Among the NRTI resistance mutations, K65R had a significant increase in prevalence from 0.03% to 0.35% in sub-Saharan Africa, whereas M184V had a non-significant increase from 0.65% to 1.05%. The NNRTI resistance mutations K103N and V106M also increased in prevalence in the Latin America/Caribbean region. Of note, K65R and V106M are preferentially selected in subtype C viruses, which are highly prevalent in sub-Saharan Africa [23,24]. There were no significant increases in any DRMs in South/Southeast Asia. Table S3 lists the prevalence of each DRM in each region by time period.

| DISCUSSION
During the six-year period between 2014 and 2019, 125 studies comprising 50 or more ARV-naïve persons were submitted to GenBank. The populations and TDR prevalence in these studies were compared to data from 122 studies meeting the same criteria submitted to GenBank between 2009 and 2013. In sub-Saharan Africa, there was an increase in the median study-level prevalence of overall and NNRTI-associated TDR in the 2014 to 2019 period, whereas individual patient-level analyses demonstrated these increases and an increase in dual NRTI/NNRTI-associated TDR. Additional significant trends included an increase in NNRTI-associated TDR in the Latin America/Caribbean region and an increase in overall, NNRTI and PI-associated resistance in North America by both study-level and individual patient-level analysis.
During both time periods, there was a gradient in the prevalence of TDR with the lowest levels in South/Southeast Asia and sub-Saharan Africa; intermediate levels in Europe, the Latin America/Caribbean region and the upper-income Asian countries; and the highest levels in North America. Between the two time periods, TDR prevalence levels in sub-Saharan Africa increased relative to South/Southeast Asia, whereas those in Europe decreased relative to the Latin America/Caribbean region.
The high rates of TDR in North America reported here are consistent with recent U.S. CDC surveys presented at scientific meetings but not included in our analysis [25,26]. The higher TDR rates in the United States than in other upper-income countries may be due to the higher retention in care outside of the United States, where ART is provided free of charge [27,28]. Although PrEP has been increasingly used in the United States, its use was unlikely to have influenced TDR incidence because the mutations selected by PrEP, M184V and K65R, occurred rarely throughout the study [29]. TDR is a less significant public health problem in upper-income countries because pretherapy genotypic resistance testing is usually available to identify persons with TDR and to adjust first-line therapy accordingly. Moreover, TDR in upper-income countries often results from the onward transmission of strains containing mutations associated with ARVs that have not been used in many years [26,30-33]. In LMICs, however, TDR strains are more likely to contain mutations derived directly from treated persons, such as M184V and K65R, that have not had the opportunity to revert to wildtype [34].
This study also reveals trends in how sequence data on HIV-1-associated TDR are being obtained. First, the proportion of studies from sub-Saharan Africa decreased from 29% to 22% and those from South/Southeast Asia other than China decreased from 14% to 10%. Second, there were several differences in study design that influenced the nature of the population studied during the two time periods. Compared with the earlier study period, a smaller proportion of studies published between 2014 and 2019 were performed expressly for the purpose of monitoring TDR (79% vs. 45%; Fisher Exact test; p < 0.001) and fewer were performed at sentinel sites for HIV-1 surveillance (33% vs. 22%; Fisher Exact test; p = 0.09). Although there were no consistent regional or temporal differences in TDR prevalence between studies performed for assessing TDR prevalence and those performed for other purposes, the above trends suggest that there is a reduced investment in studies designed specifically for TDR surveillance particularly in those regions for which TDR prevalence data are most needed.
In contrast to the most recent WHO systematic review [6], we excluded studies containing ART-experienced persons presenting for initial therapy. During the 2014 to 2019 study period, nine such studies were excluded: seven from sub-Saharan Africa and two from South/Southeast Asia. Although 7% to 40% of persons in these studies had a history of previous ARV use, the median PDR prevalence in these studies (6.4%; range: 3.4% to 13.3%) did not differ from the TDR prevalence in the 125 studies containing entirely ART-naïve populations. However, PDR prevalences can be several-fold higher among persons with previous ARV use compared to those documented to be ART-naïve [9,35] because they depend in part on the extent to which patients cycle in and out of care rather than the likelihood that a person will be primarily infected by a drug-resistant virus. Therefore, for epidemiological purposes it remains valuable to monitor drug resistance in ART-naïve individuals to document the prevalence and patterns of TDR strains, to assess "hot spots" of transmission of drug-resistant virus, to identify individuals at risk for having TDR at time of diagnosis and to implement public health measures to interrupt TDR transmission.
Table 4 note: For each region, a generalized linear mixed model was used to assess the yearly change in the odds (OR) of TDR accounting for study heterogeneity using the R package lme4. The model included the categorical outcome variable indicating the presence or absence of TDR and two explanatory variables: the sample year as a fixed-effect term and the study as a random-effect term. Significant increases with p < 0.05 are indicated in bold.
Of the 247 studies in our review, 132 (53.4%) were included in the 2018 WHO meta-analysis [6]. The additional 115 studies in our review included 37 studies from LMICs published since 2016 and 78 studies from upper-income countries. The WHO meta-analysis also contained 190 studies that we excluded because they were published prior to 2009 or were unpublished, contained fewer than 50 persons, lacked publicly available sequences, and/or included persons with previous ARV use.
Our approach to analysing TDR trends has several limitations. First, some countries are better able to conduct surveillance studies. This points to the importance of the WHO global surveillance programme, which supports surveillance studies in resource-limited areas [36]. However, the number of studies conducted by WHO-supported laboratories is not high [9]. Second, by relying entirely on publicly available sequences, we were confined to the subset of studies submitted to GenBank. This can introduce a bias in that researchers from some regions may be more likely to submit their sequences to GenBank. For example, a large proportion of the studies from South/Southeast Asia were from China. Third, not all studies were performed in order to determine TDR prevalence. Although TDR prevalence estimates did not differ between studies conducted for surveillance purposes and those conducted for other reasons, differences in study design may explain some of the heterogeneity within each of the regions. Additionally, non-disclosed past ART use may have influenced TDR estimates in our studies as well as in PDR surveys.
The availability of sequences from each study participant made it possible to track geographic and temporal trends in the prevalence of SDRMs as well as mutations that are not SDRMs, such as the recently identified but as yet uncommon non-polymorphic tenofovir-associated mutations [15]. The availability of sequences from the studies presented here makes it possible for other researchers to use these data to analyse the association of different mutations with different subtypes, the genetic relatedness of sequences containing different DRMs and the proportions of positions in each sequence containing ambiguities, an indicator of recent infection [10]. As more research and clinical laboratories use NGS for genotypic resistance testing, it will be important for their data to be submitted to the NCBI Sequence Read Archive so that their data can be analysed using different mutation thresholds. Consensus NGS sequences submitted to GenBank may be difficult to include in meta-analyses if their mutation detection thresholds are set too low.

| CONCLUSIONS
Published studies of HIV-1 pol sequences in ART-naïve populations provide important insights into global TDR trends. However, the decreasing proportion of studies from the hardest hit low- and middle-income country regions and the shift away from sentinel sites for recent infection suggest that global TDR surveillance efforts require renewed emphasis.

COMPETING INTERESTS
We declare no competing interests.

AUTHORS' CONTRIBUTIONS
SYR and RWS designed the study, analysed the data, and were the primary writers of the paper. SYR, SGK, GB and JCS screened titles, abstracts and fulltext articles for inclusion and extracted data from included studies. SYR performed statistical analysis. SGK, GB and MRJ provided oversight, critical feedback and interpretation of results. All authors contributed to writing the manuscript and approved the final version.

DISCLAIMER
The funder of the study had no role in data collection or data analysis.

REFERENCES

SUPPORTING INFORMATION
Additional information may be found under the Supporting Information tab for this article.
Table S1. Search terms and strategy
Table S2. Studies which met inclusion criteria
Table S3. Prevalence of each DRM in each region by time period