Retention in care and viral suppression in differentiated service delivery models for HIV treatment delivery in sub‐Saharan Africa: a rapid systematic review

Abstract
Introduction: Differentiated service delivery (DSD) models for antiretroviral treatment (ART) for HIV are being scaled up in the expectation that they will better meet the needs of patients, improve the quality and efficiency of treatment delivery and reduce costs while maintaining at least equivalent clinical outcomes. We reviewed the recent literature on DSD models to describe what is known about clinical outcomes.
Methods: We conducted a rapid systematic review of peer‐reviewed publications in PubMed, Embase and the Web of Science and major international conference abstracts that reported outcomes of DSD models for the provision of ART in sub‐Saharan Africa from January 1, 2016 to September 12, 2019. Sources reporting standard clinical HIV treatment metrics, primarily retention in care and viral load suppression, were reviewed and categorized by DSD model and source quality assessed.
Results and discussion: Twenty‐nine papers and abstracts describing 37 DSD models and reporting 52 discrete outcomes met search inclusion criteria. Of the 37 models, 7 (19%) were facility‐based individual models, 12 (32%) out‐of‐facility‐based individual models, 5 (14%) client‐led groups and 13 (35%) healthcare worker‐led groups. Retention was reported for 29 (78%) of the models and viral suppression for 22 (59%). Where a comparison with conventional care was provided, retention in most DSD models was within 5% of that for conventional care; where no comparison was provided, retention generally exceeded 80% (range 47% to 100%). For viral suppression, all those with a comparison to conventional care reported a small increase in suppression in the DSD model; reported suppression exceeded 90% (range 77% to 98%) in 11/21 models. Analysis was limited by the extensive heterogeneity of study designs, outcomes, models and populations. Most sources did not provide comparisons with conventional care, and metrics for assessing outcomes varied widely and were in many cases poorly defined.
Conclusions: Existing evidence on the clinical outcomes of DSD models for HIV treatment in sub‐Saharan Africa is limited in both quantity and quality but suggests that retention in care and viral suppression are roughly equivalent to those in conventional models of care.


| INTRODUCTION
Throughout sub-Saharan Africa, most national HIV programmes are striving to achieve the 95-95-95 targets for HIV diagnosis, treatment and viral suppression [1]. The rapid expansion of antiretroviral therapy (ART) programmes to reach these targets has created shortfalls in health system capacity and quality [2]. In response, many countries are scaling up alternative service delivery approaches, or differentiated service delivery (DSD) models. DSD models differ from conventional HIV care in the location and frequency of interactions with the healthcare system, cadre of provider involved, and/or types of services provided [3]. Grimsrud and colleagues [4] broadly categorize DSD models as individual or group models, with service delivery at a facility or in the community. DSD models aim to achieve a wide range of potential benefits to both providers and patients. The attractiveness of DSD models is generally considered to be conditional on maintaining at least equivalent clinical outcomes to conventional care; assuming no deterioration in clinical outcomes, DSD models are hoped to generate greater patient satisfaction, lower cost to both providers and patients and create efficient and convenient service delivery.
Despite the large-scale rollout of DSD models in various formats across multiple countries, there is a dearth of evidence to document the purported benefits of the new models in routine implementation. Even the minimum requirement of equivalent clinical outcomes is poorly documented for most models and settings. The studies and evaluations available are widely inconsistent in their designs, methods and outcomes, making it difficult to draw an overall picture of the impact of the models. Monitoring and evaluation systems have not kept up with DSD model implementation, and DSD participation is poorly captured in routine records, making it challenging to compare outcomes in DSD models with those in conventional care [5]. The information available to policy makers, funders and programme implementers is thus incomplete and difficult to interpret.
To help fill this gap and create a baseline to guide future research, we conducted a rapid systematic review of the most recent peer-reviewed reports of the outcomes of DSD model implementation in sub-Saharan Africa. In view of the importance of achieving non-inferior clinical outcomes as a condition for adopting DSD models, we report here the results of our search for retention in care, viral suppression and related clinical outcomes.

| METHODS
Following World Health Organization guidance for rapid reviews [6], we conducted a rapid systematic review of peer-reviewed publications and conference abstracts that reported outcomes of differentiated service delivery (DSD) models for the provision of antiretroviral treatment (ART) in sub-Saharan Africa since 2016 [7]. The search protocol was previously presented [7], and the review was registered on the International Prospective Register of Systematic Reviews (PROSPERO CRD42019118230).
Although the full review included a wide range of outcomes for both providers and patients, the most widely available information pertained to patient-level clinical outcomes, specifically retention in care and viral suppression. In this report we focus on these outcomes only, to allow for a more detailed examination and discussion of consistently defined indicators. The full report of the review is available online [8].

| Search strategy and study selection
For this review, we adopted and modified the widely-cited frameworks put forward by Grimsrud et al [4] and Duncombe et al [3] and defined as a "differentiated model of service delivery" any approach to providing ART that focused on a specific population, the location of service delivery, the frequency of patient interaction with the healthcare system, or the cadre of healthcare provider involved [9]. We did not consider a change in services provided, without adjustment of any other characteristics, to constitute a DSD model. DSD models for all populations except for pregnant women in PMTCT programmes and clients on ART for HIV prevention (PEP or PrEP) were included in this review. A full list of inclusion and exclusion criteria for the review is shown in Table S1.
We searched the PubMed, Embase and Web of Science databases with a search string developed to identify publications which reported on HIV treatment delivery models in sub-Saharan Africa from 1 January 2016 until 12 September 2019. The final search was conducted on 12 September 2019. We supplemented the peer-reviewed publications by manually searching peer-reviewed abstracts from major conferences for the same period. Search strings and a full list of conferences included can be found in Table S2. Limiting eligible articles and abstracts to those published or presented since 2016 was intended to ensure that results come as close as possible to reflecting the current state of DSD model implementation and to avoid repeating the efforts of previous reviews [2,10-13]. If a source reported patient follow-up data collected both before and after 1 January 2016, we included it only if the majority of follow-up time (more than 50%), as stated in the source or estimated by the authors, occurred after that date. Therefore, the bulk of the implementation period for the models included is 2016 or later.
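The date criterion above can be sketched as a simple calculation. This is a minimal illustration, not the procedure the review team used: it assumes follow-up is spread evenly over the calendar window, and the function names and example dates are hypothetical.

```python
from datetime import date

# Eligibility date stated in the review.
CUTOFF = date(2016, 1, 1)


def share_of_follow_up_after_cutoff(start: date, end: date,
                                    cutoff: date = CUTOFF) -> float:
    """Fraction of the calendar window [start, end] on or after the cutoff.

    Assumption: follow-up time is spread evenly across the window, so
    calendar days stand in for patient follow-up time.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    total_days = (end - start).days
    days_after = (end - max(start, cutoff)).days
    return max(days_after, 0) / total_days


def meets_date_criterion(start: date, end: date) -> bool:
    """True if more than 50% of the follow-up window falls after the cutoff."""
    return share_of_follow_up_after_cutoff(start, end) > 0.5


# Hypothetical source with follow-up from July 2015 to June 2017.
print(meets_date_criterion(date(2015, 7, 1), date(2017, 6, 30)))
```

Where a source reports person-time directly, that figure would be preferable to a calendar-window approximation of this kind.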
We excluded sources that reported interventions aimed at improving conventional care that we judged did not in themselves comprise DSD models, such as adherence interventions that strengthened existing counselling or offered incentives for retention within the conventional model of care. We also excluded cross-sectional surveys of patients or providers who were asked to comment on DSD models but did not have personal experience with them. If two source documents described what we determined to be the same cohort of patients enrolled in the same instance of the model, we counted only one model but cited both references for it. If one source document superseded another, for example by reporting more complete data or longer term outcomes, we kept only the more informative source. Where full conference presentations or posters were available, we used these rather than the abstracts. If two source documents reported data on the same study, we included the one with the most recent results.
All peer-reviewed references identified using the respective search strings from PubMed, Embase and Web of Science were imported into an EndNote™ library, where deduplication occurred. An initial, independent, blinded review (reviewers were not aware of each other's decisions) of the titles and abstracts was conducted by three study team members (SK, RC, CG) using Rayyan QCRI [14]. A full-text review was then conducted for all publications remaining after the initial review by two study team members (SK, CG). Reasons for excluding publications were recorded during the full-text review. As a quality check, another author (LL) also checked a sample (10%) of the excluded sources against exclusion criteria. At each stage of the review process, any conflicts between reviewers were assessed and resolved through the consensus of two authors (LL, SR). The results of the search were documented in accordance with the PRISMA-P reporting checklist (Text S1) [15,16].

| Data extraction
The data extraction tool was designed to capture each DSD model separately, regardless of whether the source publication described one or many models. In addition to standard bibliographic descriptors, we collected two types of data: a) a detailed description of the model of service delivery; and b) the outcomes that were reported for the model. We categorized each model according to the taxonomy described by Grimsrud [4], with four categories: facility-based individual models, out-of-facility-based individual models, healthcare worker-led groups and client-led groups. We then used the adapted Duncombe [3] schema to describe the model in terms of population, provider, location, frequency and services provided, as well as its outcomes. Where a comparison was provided with the pre- or non-differentiated standard of care, we also extracted data about these comparison models, henceforth referred to as conventional care.

| Outcomes
We report here standard clinical HIV treatment metrics, including retention in care, viral load suppression, adherence and pharmacy refill rates. We used each source's own definition and timing of these outcomes, accepting that definitions for "retention in care" vary widely, as do thresholds for determining viral suppression. Retention usually referred to the proportion of patients enrolled in a DSD model and retained in the ART programme at a specific time point after enrolment in the study. The point at which a patient was considered no longer in care (i.e. not retained) varied by study or country. Where a loss to follow-up (LTFU) proportion was reported, we converted it to a retention rate (as 100% minus the LTFU percentage). Most sources defined viral suppression as <1000 copies/mL. Viral suppression was not always reported among those retained in care. Adherence and prescription refill frequency were uncommon outcomes but are included in this analysis when reported. Other outcomes from the full review, such as costs to providers and patients, can be found elsewhere [17,18].
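The two numeric conventions described above can be written out as follows. This is a minimal sketch: the function names are hypothetical, and the 1000 copies/mL default reflects only the threshold that most sources used, not a universal definition.

```python
def retention_from_ltfu(ltfu_percent: float) -> float:
    """Convert a reported loss-to-follow-up percentage to a retention percentage."""
    if not 0 <= ltfu_percent <= 100:
        raise ValueError("LTFU must be a percentage between 0 and 100")
    return 100.0 - ltfu_percent


def is_virally_suppressed(copies_per_ml: float, threshold: float = 1000.0) -> bool:
    """Apply a suppression threshold; individual sources may use a different one."""
    return copies_per_ml < threshold


# Hypothetical values for illustration only.
print(retention_from_ltfu(12.5))   # 87.5
print(is_virally_suppressed(400))  # True
```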

| Analysis
To structure the results, we first divided the models into the four categories mentioned above: facility-based individual models (FBIM), out-of-facility-based individual models (OFBIM), client-led groups (CLG) and healthcare worker-led groups (HCWLG). In publications where more than one model was described, we counted each model separately. We report outcomes as stated in the original publications, adjusted where possible to utilize uniform metrics (e.g. by converting a reported percentage of patients lost to follow-up to the percentage of patients retained). As explained in the search protocol [7], we feared that it would be misleading to conduct aggregate analyses due to the heterogeneity of model designs, participating populations and study settings, even where outcomes themselves were similar. We thus report only the disaggregated results.
We assessed the quality of the cohort studies using the Newcastle-Ottawa scale [18,19]. The quality rating covered a review of selection, comparability and outcome domains and generated a score out of 9. There are no standardized quality rating categories, but to simplify interpretation of scores, those studies that scored 7 or above were categorized as high quality, those scoring between 4 and 6 were of moderate quality, and those scoring below 4 were considered low quality, as done in previous studies [19]. Randomized controlled trials were assessed using the Cochrane Collaboration's tool for assessing the risk of bias in cluster randomized controlled trials [20]. We assessed sequence generation, participant recruitment with respect to randomization timing, deviation from intended intervention, completeness of outcome data for each main outcome, bias in the measurement of outcome, and bias in the selection of the reported result. The risk of bias assessment for the one remaining cross-sectional study was not conducted [21].
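The score bands described above for the Newcastle-Ottawa scale map onto a short lookup. This is a sketch of the stated cut-offs only; the function name is hypothetical.

```python
def nos_quality_category(score: int) -> str:
    """Map a Newcastle-Ottawa score (0 to 9) to the bands used in this review."""
    if not 0 <= score <= 9:
        raise ValueError("Newcastle-Ottawa scores range from 0 to 9")
    if score >= 7:
        return "high"
    if score >= 4:
        return "moderate"
    return "low"


for score in (3, 4, 6, 7):
    print(score, nos_quality_category(score))
```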

| Sources identified
The results of the systematic search are shown in Figure 1. A total of 3,498 non-duplicate abstracts of peer-reviewed journal articles and 12,822 abstracts from the selected conferences were screened. After the initial title and abstract review, 16,092 articles and abstracts were excluded, leaving 228 documents for full review. During the full review, an additional 181 were excluded. Reasons for exclusions are reported in Table S3. The primary reason (60%) for excluding articles was date: most or all of the underlying data were collected prior to 2016. The main reason for excluding conference abstracts (33%) was insufficient information to adequately describe the model and at least one of the outcomes of interest.
Nine peer-reviewed articles and 38 conference abstracts (47 total) were retained in the final data set for the full review. Of these, 29 included one or more clinical outcomes and were included in the analysis reported here. Three quarters of these sources (76%) reported observational cohort studies; most of the rest (21%) were randomized trials. South Africa (27%) and Zambia (22%) jointly accounted for nearly half the sample (Table S3).
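As a consistency check, the screening counts reported above can be reproduced with simple arithmetic. All figures come from the text; only the variable names are introduced here.

```python
# Counts reported in the text.
journal_abstracts = 3498
conference_abstracts = 12822
excluded_at_screening = 16092
excluded_at_full_text = 181

screened = journal_abstracts + conference_abstracts
full_text_reviewed = screened - excluded_at_screening
retained = full_text_reviewed - excluded_at_full_text

print(screened)            # 16320
print(full_text_reviewed)  # 228
print(retained)            # 47, i.e. 9 articles + 38 conference abstracts
```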

| Differentiated models included in the review
The 29 sources described outcomes for a total of 37 discrete differentiated service delivery models, excluding conventional care models for comparison. Models are described briefly in Table 1 below and in full in Table S3. In the tables, each model is assigned a model identifier (ID), which is used to reference that model throughout the review. If a source document (article or abstract) reports on more than one DSD model, multiple model IDs will be associated with it in Table 2. Each model identifier contains an acronym for the model category (FBIM, OFBIM, CLG or HCWLG) followed by a number. For example, client-led groups have model IDs CLG1 through CLG5, indicating that there were five distinct CLG models identified. In one instance (HCWLG11), the same model is referred to in more than one source document [22,23].
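The identifier scheme described above (category acronym followed by a number) can be parsed mechanically, which may help when cross-referencing the tables. The regular expression and function name below are illustrative assumptions, not part of the review's tooling.

```python
import re

# Category acronyms as defined in the review, followed by a running number.
MODEL_ID_PATTERN = re.compile(r"^(FBIM|OFBIM|CLG|HCWLG)(\d+)$")


def parse_model_id(model_id: str) -> tuple[str, int]:
    """Split a model ID such as 'HCWLG11' into (category, number)."""
    match = MODEL_ID_PATTERN.match(model_id)
    if match is None:
        raise ValueError(f"not a valid model ID: {model_id!r}")
    return match.group(1), int(match.group(2))


print(parse_model_id("CLG5"))     # ('CLG', 5)
print(parse_model_id("HCWLG11"))  # ('HCWLG', 11)
```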
In addition to the models listed in Table 1, 11 source documents reported comparative results for a conventional care model, creating a total of 48 model-instances with clinical outcomes included in this review (37 DSD + 11 conventional models). Out-of-facility-based individual models (32%) and healthcare worker-led group models (35%) were the most commonly reported categories (Table S5).
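The category percentages and the model-instance total follow directly from the counts given in the abstract (7, 12, 5 and 13 models). A short sketch of that arithmetic, with variable names introduced here for illustration:

```python
# Model counts by category, as reported in the abstract.
model_counts = {"FBIM": 7, "OFBIM": 12, "CLG": 5, "HCWLG": 13}
conventional_care_comparators = 11

total_dsd_models = sum(model_counts.values())
shares = {category: round(100 * count / total_dsd_models)
          for category, count in model_counts.items()}
model_instances = total_dsd_models + conventional_care_comparators

print(total_dsd_models)  # 37
print(shares)            # {'FBIM': 19, 'OFBIM': 32, 'CLG': 14, 'HCWLG': 35}
print(model_instances)   # 48
```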
Three quarters (76%) of the models were limited to clinically stable patients, and most (59%) were for adults (Table 1). Definitions of stability varied. Some models required prior evidence of viral suppression, whereas others relied on, for example, clinical condition and minimum duration on ART prior to model entry. Details of how a stable patient is defined are presented elsewhere [50].
Additional model characteristics are described in Table S6. Most models provided basic clinical care, antiretroviral medications (ARVs) and laboratory monitoring only (78%). Almost half (46%) included services delivered both in the clinic and in the community, rather than solely one or the other. For those that identified clinical care and pharmacy refill providers, nearly all clinical care (96%) was provided by trained clinicians, though few sources specified the clinical cadre involved; more than two-thirds of medication refills (70%) were provided by non-clinician staff (community health workers, designated patients or lay counsellors). More than half the models (57%) required patients to have a total of four to eight clinic visits or DSD model interactions per year; most of the rest required more than eight visits or interactions per year, though a few (18%) were structured for three or fewer per year (Table S6). Models that are focused on adolescents and children are often more intensive than those for adults, which could inflate the average frequency estimated here. As only one model in our review was aimed at adolescents, however, it is unlikely that this had a substantial impact on this estimate [47].

| Outcomes
A total of 55 outcomes were reported for the 37 models included in the review (Table 2). Retention in care was the most common, reported for 78% of the models. Just over half the models (59%) reported viral suppression.
Quantitative results for each study are shown in Table 3. Some studies included effect sizes in comparison with conventional care, whereas others did not provide comparison values at all, but simply reported the outcomes of the DSD models. Table 4 provides additional information, including effect sizes, for studies that did report these measures. More detailed versions of both tables, including any estimates or calculations by the authors, can be found in Table S7.

| Retention in care
Although retention in care was the most commonly reported outcome, only a few sources provided a comparison to conventional care. For those that did, retention in the DSD model was generally within 5% of that in conventional care, with the exception of a healthcare worker-led group model in the Democratic Republic of Congo, which greatly improved retention [24]. Among those not providing a comparison, retention generally exceeded 80% (range 47% to 100%). For the few sources (n = 3) which reported retention outcomes with an effect size, effects varied widely, from much better than conventional care to somewhat worse.
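The "within 5%" comparison can be expressed as a simple check. This sketch assumes the margin is an absolute difference in percentage points, which the text does not state explicitly; the function name and example values are hypothetical.

```python
def within_margin(dsd_retention: float, conventional_retention: float,
                  margin_points: float = 5.0) -> bool:
    """True if the two retention percentages differ by no more than the margin.

    Assumption: the margin is an absolute difference in percentage points,
    not a relative difference.
    """
    return abs(dsd_retention - conventional_retention) <= margin_points


# Hypothetical retention percentages for illustration only.
print(within_margin(88.0, 91.0))  # True
print(within_margin(95.0, 80.0))  # False
```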

| Viral load suppression
Among the 22 models that reported viral load suppression, ten included a comparison with conventional care (including one that reported only an effect estimate and not actual values). All those with a comparison reported a small increase in suppression in the DSD model. Reported suppression exceeded 90% (range 77% to 98%) in 11/21 models. Five models reported viral suppression with an effect size estimate. Three of these found no difference in suppression when adjusting for baseline differences. Streamlined care in Uganda and Kenya [25] and CAGs in Mozambique [40] both reported approximately 15% improvements in suppression (prevalence ratio = 1.15 and unadjusted odds ratio = 1.16, respectively).
Table footnotes: The authors used associated documents (e.g. published study protocols, unpublished reports) relevant to these source documents to supplement the DSD model description, if insufficient detail was provided in the publication itself; c Sample sizes pertain to the entire study population rather than for a specific DSD model. For publications that evaluated different DSD models in each arm, we report the total N for the study cohort rather than the N in each study arm; d For most models, stable was defined per national guidelines, though clinicians used clinical criteria to define stability when necessary laboratory tests were not available.

| Adherence and prescription refill rates
Few sources (n = 4) used adherence to ARVs or prescription refill rates as outcomes; results are shown in Table 3. Rates of adherence (n = 1) and prescription refill (n = 3) were >90% (range 92% to 100%) across the models. Only two reported a comparison with conventional care and the DSD model outperformed conventional care in both instances. No effect sizes were reported for adherence or prescription refill measures.

| Quality of evidence
Among the three-quarters of the sources included that were cohort studies and thus evaluated on the Newcastle-Ottawa scale, the quality of the evidence was generally low to moderate (Table S8). Only two of the 22 cohort studies received a score of 7 points (high quality) on the 9-point scale. The relatively low quality of evidence among cohort studies was due mainly to the absence of comparators in many of the studies and the scarcity of detail found in conference abstracts. Most of the remaining studies (n = 6) were randomized controlled trials, for which we assessed quality using the Cochrane Collaboration's tool for assessing risk of bias in cluster randomized trials (Table S9) [20]. All three full-length articles (four models) were at low risk for bias [23,25,35], but a concern about bias applied to the two abstracts, driven mainly by the fact that the conference abstracts did not contain full information on study methodology [39,48].

| DISCUSSION
We systematically reviewed and synthesized the current evidence related to clinical outcomes of differentiated service delivery models for HIV treatment in sub-Saharan Africa between 2016 and 2019. While we identified 29 sources that described one or more clinical outcomes of 37 DSD models in 11 countries, only a minority (28%) compared the alternative models to conventional care or to one another, making it difficult to draw strong conclusions about the overall impact of DSD models on clinical outcomes. Because of the heterogeneity of outcome definitions and timing and the highly variable quality, size and scope of the studies included, we opted to present outcomes individually for each model, stratified by model category and outcome, rather than to estimate aggregate statistics. For those models that did provide a comparison with conventional care, retention in care in DSD models was generally within 5% of that in conventional care, with a few exceptions that reported much better retention. Similarly, viral suppression was generally equivalent or slightly higher in the DSD models. We did not expect to see a marked improvement in clinical impact (retention or viral suppression) because most DSD models are limited to already-stable patients, for whom outcomes can be sustained but cannot improve. Where comparisons with conventional care were provided and effect sizes reported, effects on retention and suppression varied widely, from slightly worse than conventional care to moderately better. In general, DSD models were not associated with a meaningful deterioration in patient outcomes, despite in many cases having fewer interactions with patients or relying on lower cadres of clinicians than did conventional care. These clinical indicators, while capturing the direct health benefit of DSD models, do not reflect patient experience of the model. 
The limited available qualitative data on patient satisfaction identified as part of the rapid systematic review have been reported elsewhere [51]. As is evident from the discussion earlier, this review had many limitations. While we believe that our search of the peer-reviewed, published literature and abstracts was thorough, the lack of standard terminology for describing DSD models hampered the creation of precise search strings, and it is possible that some sources were missed. Most sources did not describe procedures for recruiting patients into DSD models, but it is possible that self- and provider-selection biased participation towards the most motivated and empowered patients, among all those who met formal eligibility criteria. More important, the extreme heterogeneity of the sources that did meet inclusion criteria rendered any attempt to aggregate results or produce summary statistics misleading. This heterogeneity manifested itself in multiple ways. The topic of DSD models is highly diverse in itself. Evaluation methods ranged from single-site, single-arm observational cohorts to large randomized trials. The majority of sources did not provide comparisons with conventional care, and metrics for assessing outcomes varied widely and were in many cases poorly defined. The underlying patient populations were often poorly described, without disaggregation by age or sex, or were by design different by model even within countries. Finally, with the exception of the randomized trials that included a standard of care arm, outcomes reported reflect only what is happening with patients eligible for the DSD models, who in most cases were already stable on ART. By definition, these models increase the proportion of ineligible patients remaining in conventional care, whose outcomes may be worse. The outcomes reported for specific DSD models can thus not be regarded as overall ART programme outcomes.
Stemming from these limitations, the search reported here identifies gaps in the evidence base and research priorities for DSD model implementation in the coming years. In particular, rigorous evaluation of clinical outcomes, with relevant comparisons, is needed if we are to fully understand the implications of DSD models for HIV control. Longer term follow-up under routine care settings, beyond the first 12 or 24 months, should be undertaken, as it is critical to know what happens to retention and viral suppression three, five, or ten years after entry into a DSD model. This is especially important when DSD models are focused on stable patients and large changes in treatment outcomes are unlikely in the short term. Evaluation reports on the outcomes of DSD models should consistently include a description of the population served, as models limited to already-stable patients are likely to have different outcomes from those that enrol a cross section of the ART patient population. Wherever possible, evaluations should include an entire ART population (patients eligible for and not eligible for DSD models; patients enrolled or not enrolled in the models), so that overall treatment programme outcomes can be estimated, rather than only those for patients in the models. Finally, there is also a need for electronic medical record systems to evolve to capture data on DSD model participation, as this is an essential step towards understanding the true clinical and other impacts of DSD models.

| CONCLUSIONS
We note that there is a difference between the clinical outcomes of the patients enrolled in DSD models and the "impact" of implementing DSD models as part of national HIV programmes. In many of the studies included in this review, only a small proportion of eligible patients were enrolled in a DSD model, and only those patients' outcomes reported. The effect of those patients' outcomes on the overall, aggregate outcomes of the healthcare facilities at which the DSD models were implemented may have been modest, or even trivial, if large numbers of other patients remained in conventional care. Future evaluations of the outcomes of DSD models would be of greater value if they considered the entire, relevant patient population (for example, all the ART patients served by a facility, or all the ART patients in a catchment area) as the denominator for assessing success.
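The denominator argument can be made concrete with a weighted average. The sketch below is illustrative only: the function name and all numbers are hypothetical and are not drawn from any study in the review.

```python
def facility_level_outcome(dsd_share: float, dsd_outcome: float,
                           conventional_outcome: float) -> float:
    """Weighted average of an outcome across DSD and conventional-care patients.

    dsd_share is the fraction (0 to 1) of the facility's ART patients in a DSD model.
    """
    if not 0 <= dsd_share <= 1:
        raise ValueError("dsd_share must be between 0 and 1")
    return dsd_share * dsd_outcome + (1 - dsd_share) * conventional_outcome


# Hypothetical facility: 10% of patients in a DSD model with 95% retention,
# the remaining 90% in conventional care with 85% retention.
overall = facility_level_outcome(0.10, 95.0, 85.0)
print(round(overall, 1))  # 86.0
```

In this hypothetical case a ten-point advantage within the DSD model moves facility-level retention by only one point, which is why the choice of denominator matters when judging programme impact.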

COMPETING INTERESTS
The authors declare that they have no conflicting interests.

AUTHORS' CONTRIBUTIONS
LL, SK and SR conceived of and designed the study. SK, SP, BEN, RC, CG and AH identified and reviewed sources and extracted data. LL, SK and SR analysed the data and drafted the manuscript. DF contributed to design and data collection. MF contributed to design and data analysis. All authors reviewed and edited the manuscript.

ACKNOWLEDGEMENTS
None declared.

FUNDING
Funding for the study was provided by the Bill & Melinda Gates Foundation through OPP1192640 to Boston University. The funder had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

SUPPORTING INFORMATION
Additional information may be found under the Supporting Information tab for this article.
Table S1. Inclusion/exclusion criteria for publications and abstracts
Table S2. Search strategy
Table S3. Reasons for exclusions after full-text review
Table S4. Original source documents and models
Table S5. Characteristics of source documents
Table S6. Characteristics of service delivery models
Table S7. Details of treatment outcomes reported in the source documents
Table S8. Risk of bias assessment for cohort studies (Newcastle-Ottawa Scale)
Table S9. Risk of bias assessment for cluster randomized trials (Cochrane Collaboration's tool)