The temporal increase in HIV-1 non-R5 tropism frequency among newly diagnosed patients from northern Poland is associated with clustered transmissions

Introduction CCR5 (R5) tropic viruses are associated with early stages of infection, whereas CXCR4 (X4) HIV-1 tropism has been associated with severe immunodeficiency. We investigated the temporal changes in the genotype-predicted tropism frequency and the phylogenetic relationships between the R5 and non-R5 clades. Methods A cohort of 194 patients with a newly diagnosed HIV infection that was linked to their care from 2007 to 2014 was analyzed. Baseline plasma samples were used to assess the HIV-1 genotypic tropism with triplicate V3-loop sequencing. The non-R5 tropism prediction thresholds were assigned using false positive rates (FPR) of 10% and 5.75% and associated with clinical and laboratory data. The transmission clusters were analyzed using pol sequences with maximum likelihood and Bayesian inference. Results The overall non-R5 tropism frequency for 5.75% FPR was 15.5% (n=30) and 27.8% (n=54) for 10% FPR. The frequency of the non-R5 tropism that was predicted using 5.75% FPR increased significantly from 2007 (0%) to 2014 (n=5/17, 29.4%) (p=0.004, rough slope +3.73%/year) and from 0% (2007) to 35.3% (2014, n=6/17) (p=0.071, rough slope +2.9%/year) using 10% FPR. An increase in asymptomatic diagnoses over time was noted (p=0.05, rough slope +3.53%/year) along with a tendency towards a higher lymphocyte CD4 nadir (p=0.069). Thirty-two clusters were identified, and non-R5 tropic viruses were found for 26 (30.95%) sequences contained within 14 (43.8%) clusters. Non-R5 tropism was associated with subtype D variants (p=0.0001) and the presence of CCR5 Δ32/wt genotype (p=0.052). Conclusions R5 tropism predominates among treatment-naive individuals, but the increases in the frequency of non-R5 tropic variants may limit the clinical efficacy of the co-receptor inhibitors. The rising prevalence of non-R5 HIV-1 may indicate transmission of X4 clades.


Introduction
HIV-1 entry into the target cell requires the use of the CCR5 or CXCR4 co-receptor [1]. Circulating HIV clades may exhibit tropism to one or both of the co-receptors and are classified as R5, X4, or dual/mixed (D/M) tropic according to their use of CCR5, CXCR4, or both co-receptors, respectively [2]. As R5 tropic variants predominate in the primary and early stages of infection, previous reports suggested the possible preferential transmission of CCR5-utilizing variants, with the mucosal barrier acting as a factor driving this genetic bottleneck [3]. With the progression of HIV disease, the frequency of non-R5 clades increases and reaches approximately 50% [4]. X4 viruses have been associated with faster lymphocyte CD4 decline compared with the R5 clades, but their presence may also reflect delayed HIV diagnosis [5].
With the introduction of CCR5 inhibitors into clinical practice, sequence-based assays to predict HIV-1 tropism have been developed and validated [6]. Tropism predictions based on the sequencing of the third hypervariable (V3) loop allow clinicians to distinguish between R5 and non-R5 (X4 and D/M) clades using the amino acid charge of the V3 region and various prediction algorithms, of which geno2pheno is the most popular [7]. Genotypic tropism predictions allow clinicians to not only select patients that may be susceptible to CCR5 inhibitors but also to screen large cohorts to observe the clinical and genetic characteristics associated with tropism and to analyze the spread of R5 and X4 viruses [8]. We have recently identified an association between the chemokine receptor genetic variants and tropism, with the CX3CR1 rs3732378 A allele being associated with increased prevalence of R5 tropic clades in newly diagnosed individuals [9]. In this study, we aimed to investigate the temporal changes in tropism frequency in the same dataset supplemented with 2013 and 2014 data and to analyze the phylogenetic relationships between the R5 and X4 tropic clades using matched reverse transcriptase and protease sequences to characterize transmission events and clustering.

Study group
We analyzed samples from 194 newly diagnosed treatment-naïve patients with confirmed HIV infection linked to care at the Department of Infectious, Tropical Diseases and Acquired Immune Deficiency (Pomeranian Medical University, Szczecin, Poland) and Out Patients' Clinic for Acquired Immunodeficiency (Regional Hospital, Szczecin, Poland) from 2008 to 2014. All patients reporting to the centre with the confirmation test within one year from the first visit date were included in the study. The study protocol was approved by the bioethical committee of Pomeranian Medical University, Szczecin, Poland (approval number KB-0012/08/12 for HIV-1 tropism and BN-001/34/04 for CCR5 Δ32 genotyping), and consent was obtained from the patients included in the study. Plasma samples collected prior to the introduction of antiretroviral treatment were used for HIV-1 tropism assessment. Whole blood samples were used for DNA extraction and CCR5 Δ32 genotyping.
The following clinical data were collected: age, sex, date of HIV diagnosis, route of transmission, hepatitis C co-infection, clinical category at diagnosis according to the CDC case definition, baseline HIV viral load, and the baseline and nadir lymphocyte CD4+ counts. The baseline lymphocyte CD4 counts were defined as the first documented result after HIV diagnosis.
The CDC category at diagnosis was assumed based on a review of the patient's clinical record. Polymerase chain reaction (PCR) with sequence-specific primers was used to analyze the CCR5 Δ32 (rs333) variation, according to the previously described PCR methodology [10,11].

HIV-1 tropism assessment
V3 loop sequencing was performed according to the methodology provided by the HIV Centre of Excellence (personal communication, prof. R. Harrigan) on the first collected plasma sample of each newly diagnosed, treatment-naive patient. Briefly, nested PCR was performed after reverse transcription of the extracted HIV-1 RNA. The amplicons were sequenced by standard techniques using an ABI 3500 platform (Applied Biosystems, Foster City, CA). Analyses were performed in triplicate. Two overlapping sequencing reactions (forward and reverse) were performed for each sample. Standardized operating procedures were used to prevent contamination of the sequences; RNA extraction, reverse transcription and amplification were performed in separate, dedicated laboratory rooms. The sequences were assembled using the Recall online tool (www.pssm.cfenet.ubc.ca), which also provided sequence-based HIV-1 tropism interpretations using the geno2pheno algorithm [12]. The same version of the tropism interpretation algorithm was used for all the sequences included in the study. Non-R5 tropism prediction thresholds were assigned using a false positive rate (FPR) of 10% as defined by the European Guidelines on HIV-1 tropism testing [6] and 5.75% FPR as defined by the MERIT and MOTIVATE trials [13]. Across the triplicates, discordant tropism results were found in seven (3.6%) cases for FPR 10% and two (1%) for FPR 5.75%.
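The thresholding step described above can be illustrated with a minimal sketch. The text does not state how the three replicate FPR values were combined into a single call, so the rule used here (non-R5 if any replicate falls at or below the cutoff) and the function names `call_tropism` and `is_discordant` are assumptions made for illustration only, not the authors' procedure.

```python
from typing import Sequence

# FPR cutoffs (in percent) named in the text: 10% (European guidelines)
# and 5.75% (MERIT/MOTIVATE trials).
FPR_CUTOFFS = (10.0, 5.75)


def call_tropism(replicate_fprs: Sequence[float], cutoff: float) -> str:
    """Return 'non-R5' or 'R5' for one sample.

    ASSUMPTION: a sample is called non-R5 when ANY replicate has an FPR at or
    below the cutoff. The source does not confirm this combination rule.
    """
    if not replicate_fprs:
        raise ValueError("at least one replicate FPR is required")
    return "non-R5" if min(replicate_fprs) <= cutoff else "R5"


def is_discordant(replicate_fprs: Sequence[float], cutoff: float) -> bool:
    """True when the replicates do not all give the same call at this cutoff."""
    calls = {"non-R5" if fpr <= cutoff else "R5" for fpr in replicate_fprs}
    return len(calls) > 1


if __name__ == "__main__":
    sample = [42.0, 8.1, 30.5]  # hypothetical triplicate FPR values (%)
    for cutoff in FPR_CUTOFFS:
        print(cutoff, call_tropism(sample, cutoff), is_discordant(sample, cutoff))
```

With the hypothetical triplicate above, the sample is called non-R5 and flagged discordant at the 10% cutoff, but R5 and concordant at 5.75%, which shows why the two thresholds can yield different frequencies for the same cohort.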

Subtyping and phylogenetic analyses
Bayesian inference was used to analyze the phylogenetic relationships between the sequences with a similar predicted tropism. As the tropism sequences were short and triplicate testing confounds the phylogenetic analysis, we used the 1302 bp (HXB2 genome location 2253 to 3525) protease/reverse transcriptase sequences obtained by standard Viroseq 2.8 genotyping of the same samples. As one sequence was notably shorter (<1000 bp long), 193 sequences were included in the phylogenetic analyses. First, the sequences were aligned with Clustal X2.0.10 (www.clustal.org) software [14]. A GTR model with empirical nucleotide frequencies was selected with the jModeltest 2.1.1 software [15]. The nucleotide frequencies calculated under this model were as follows: freqA = 0.4179, freqC = 0.1598, freqG = 0.1985, freqT = 0.2238, gamma shape parameter = 0.903 and p-inv = 0.462.
To investigate the existence of clusters with similar tropism, the maximum likelihood (ML) method with the NNI-SPR sub-tree algorithm and the GTR model, run on the PHYML v 3.0 online web server, was used to compute evolutionary distances and calculate the aLRT values [16]. In addition, we used Bayesian Markov Chain Monte Carlo (MCMC) inference for the analyses. Two replicates of 100 million generations were run in BEAST v. 1.5.3 [17] using a constant population size and a GTR+G model with an uncorrelated lognormal relaxed molecular clock. All prior and posterior effective sample size values exceeded 200 [18]. Clustering was assessed using Cluster Picker software with the maximum genetic distances calculated by the program. Clusters were assigned with an aLRT value ≥90% for the ML method and a posterior value ≥95% for Bayesian inference; in both cases, maximum pair-wise distances <0.045 were used [19]. A consensus tree with posterior probabilities for branch support was obtained and annotated with TreeAnnotator v 1.5.4. All trees were visualized in Figtree v.1.2.2.
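The two clustering criteria described above (branch support and maximum pairwise genetic distance) can be expressed as a simple check on a candidate clade. This is an illustrative sketch, not the Cluster Picker implementation: the function names, the dictionary-based distance lookup, and the choice to treat both thresholds as inclusive are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable


def max_pairwise_distance(tips: Iterable[str],
                          dist: Dict[FrozenSet[str], float]) -> float:
    """Largest genetic distance between any two tips of a candidate clade."""
    pairs = combinations(list(tips), 2)
    return max((dist[frozenset(pair)] for pair in pairs), default=0.0)


def qualifies_as_cluster(tips: Iterable[str],
                         support: float,
                         dist: Dict[FrozenSet[str], float],
                         min_support: float = 0.90,
                         max_distance: float = 0.045) -> bool:
    """Apply the support and distance criteria to one candidate clade.

    ASSUMPTION: both thresholds are inclusive. `support` is the aLRT value
    (default threshold 0.90) or the posterior probability (pass
    min_support=0.95), expressed as a fraction.
    """
    tips = list(tips)
    if len(tips) < 2:
        return False  # a cluster needs at least two sequences
    return (support >= min_support
            and max_pairwise_distance(tips, dist) <= max_distance)


if __name__ == "__main__":
    # Hypothetical pairwise distances (substitutions per site).
    d = {
        frozenset(("A", "B")): 0.010,
        frozenset(("A", "C")): 0.030,
        frozenset(("B", "C")): 0.050,
    }
    print(qualifies_as_cluster(["A", "B"], 0.95, d))       # pair within limits
    print(qualifies_as_cluster(["A", "B", "C"], 0.99, d))  # B-C too distant
```

A clade that passes under ML support would, following the text, also need to pass the same check with the Bayesian posterior (`min_support=0.95`) before being accepted.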
To evaluate the number of recent infections in the sample, the fraction of ambiguous nucleotides in the partial pol sequences was used; a 0.5% ambiguity cutoff value suggestive of recent infection was implemented, according to a previous study [20].
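As a rough illustration of the ambiguity measure, the fraction of ambiguous nucleotides can be computed by counting IUPAC ambiguity codes among the called bases. The sketch below makes several assumptions that the text does not specify: `N` is counted as ambiguous, gap characters are ignored, and a fraction at or below the 0.5% cutoff is treated as suggestive of recent infection.

```python
UNAMBIGUOUS = set("ACGT")
# IUPAC codes that stand for more than one base (ASSUMPTION: N is included).
IUPAC_AMBIGUOUS = set("RYSWKMBDHVN")


def ambiguity_fraction(seq: str) -> float:
    """Fraction of called positions that carry an IUPAC ambiguity code.

    Gaps and any other non-nucleotide characters are skipped.
    """
    bases = [b for b in seq.upper()
             if b in UNAMBIGUOUS or b in IUPAC_AMBIGUOUS]
    if not bases:
        raise ValueError("sequence contains no nucleotide characters")
    ambiguous = sum(1 for b in bases if b in IUPAC_AMBIGUOUS)
    return ambiguous / len(bases)


def suggests_recent_infection(seq: str, cutoff: float = 0.005) -> bool:
    """ASSUMPTION: a fraction at or below the cutoff suggests recent infection."""
    return ambiguity_fraction(seq) <= cutoff


if __name__ == "__main__":
    clean = "ACGT" * 300                 # hypothetical sequence, no ambiguities
    mixed = "ACGT" * 290 + "RYRYRYRYRY"  # hypothetical, 10 ambiguous of 1170
    print(ambiguity_fraction(clean), suggests_recent_infection(clean))
    print(ambiguity_fraction(mixed), suggests_recent_infection(mixed))
```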

Statistical analyses
Fisher's exact and chi-square tests were used for the nominal variables, and Mann-Whitney U/ANOVA tests were used for the continuous variables (Statistica PL 8.0, StatSoft, Poland). Time trends were examined using logistic regression (R statistical platform, v. 3.1.0) for the binary variables and linear regression (Statistica PL 8.0, StatSoft, Poland) for the continuous variables. To validate the results, we calculated the power of the sample sizes based on the assumption that the population size in the region for the years 2008 to 2014 was 500 cases (the total number of newly diagnosed cases followed up in the centre, increased by 30%, the estimated percentage of undiagnosed HIV infections in Poland). Based on the observed tropism frequencies, for the FPR 5.75%, the 95% CI sample size was 168 cases, providing a 4.57% margin of error, whereas for the FPR 10%, the 95% CI sample size was 191 cases, providing a 4.94% margin of error.
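The sample size reasoning above can be sketched with the standard formula for estimating a proportion with a finite population correction. The text does not state which formula or calculator the authors used, so this is an assumed textbook approach and its output is not expected to match the reported figures of 168 and 191 cases exactly; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def sample_size(p: float, margin: float, population: int,
                z: float = Z_95) -> int:
    """Sample size for estimating proportion p within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2, then the finite population
    correction n = n0 / (1 + (n0 - 1) / population), rounded up.
    """
    n0 = z ** 2 * p * (1.0 - p) / margin ** 2
    return math.ceil(n0 / (1.0 + (n0 - 1.0) / population))


def margin_of_error(p: float, n: int, population: int,
                    z: float = Z_95) -> float:
    """Margin of error for a sample of n drawn from a finite population."""
    fpc = math.sqrt((population - n) / (population - 1))
    return z * math.sqrt(p * (1.0 - p) / n) * fpc


if __name__ == "__main__":
    # Observed frequencies and margins from the text; population of 500 as assumed there.
    for label, p, margin in (("FPR 5.75%", 0.155, 0.0457),
                             ("FPR 10%", 0.278, 0.0494)):
        print(label, sample_size(p, margin, 500))
    print(round(margin_of_error(0.155, 194, 500), 4))
```

The correction matters here because the cohort (194 patients) is a large share of the assumed regional population of 500, which narrows the margin of error relative to an unlimited population.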

Tropism clustering
We constructed a phylogenetic tree containing 193 pol sequences corresponding to the samples used for the tropism assessment. The tropism as well as the CCR5 Δ32 genotype were identified for every tip in the phylogenetic tree. In total, 32 clusters were identified (Figure 3). It should be noted that 28 sequences were obtained from known partners and, therefore, most clusters containing only two isolates are partner pairs. Fourteen (43.8%) clusters contained 26 (30.95%) non-R5 tropic clades (FPR <10%). Of these, 4 clusters contained only non-R5 sequences, and both non-R5 and R5 tropic viruses were found in 10 clusters. A six-sequence cluster (marked with # on Figure 3) contained five non-R5 tropic clades from three injection drug users and two female sexual partners, which may indicate a transmission network. Pure non-R5 clusters were more common among the non-B (subtype D) variants (p=0.0001) and among patients with the CCR5 Δ32/wt genotype (p=0.052) (Table 2). The clusters with non-R5 sequences only were the most common among heterosexually infected cases compared to the MSM and IDU cases analyzed separately (p=0.008 and p=0.001, respectively) and in the three-way comparison of heterosexual vs. MSM vs. IDU cases (p=0.02).

Discussion
The development of V3 sequence-based prediction algorithms for genotypic tropism assessment allowed us to not only test prior to CCR5 inhibitor introduction but also to investigate the influence of tropism on the clinical characteristics of HIV-positive patients [21–23]. It has been shown that the presence of X4/dual mixed viruses is associated with a more rapid progression of the infection and CD4 lymphocyte loss [5]. Transmission of the X4 variants was rare in recent seroconverters [3,5], and it was suggested that R5 viruses are preferentially transmitted in both sexual and parenteral infections [1,24]. However, the data on the existence of a mucosal bottleneck that limits the propagation of X4 viruses are contradictory. Linked transmissions of the X4 viruses were found among recently diagnosed HIV patients, which suggested random R5 and X4 variant transmission [8]. In contrast, Frang et al. showed that clustered X4 variant transmission was not observed among primary HIV-1-infected individuals [25].
Overall, the frequency of the non-R5 tropism differs across cohorts and ranges from 1.4 to 19% for primary infections [3,25,26] to 9.1–38% [4,8] in newly diagnosed cases, which is in accordance with the 27.8% (FPR 10%) frequency of non-R5 variants identified in our study. Published studies indicate that the increased non-R5 tropism frequency observed across the cohorts was associated with a higher number of late diagnosed patients and longer duration of the infection, lower baseline lymphocyte CD4 count and HIV subtype [3,4,8,25,26]. In our study, we found an increasing temporal trend for the frequency of the non-R5 tropism among newly diagnosed, treatment-naïve patients and clustered non-R5 variants. This trend is not likely to be associated with delayed HIV diagnosis, as the number of patients with asymptomatic infection at diagnosis increased over time, the nadir CD4 count tended to increase, and the number of AIDS cases remained stable.
Phylogenetic reconstructions indicate the possibility of the circulation of the non-R5 tropic viruses with clusters observed across all exposure groups, but they were most common among heterosexually infected cases. Moreover, the possibility of the sexual spread of the non-R5 variants is important in light of the decline of injection drug use-related transmissions. These findings are in accordance with the data presented by Chalmet et al., who found onward transmissions of X4 or D/M tropic viruses and the presence of X4 tropic viruses in 11% of the transmission clusters [8]. Furthermore, the spread of non-R5 tropic viruses was confirmed in 31% of well-characterized transmission pairs. This study allowed for the development of the "random transmission hypothesis," which challenges the belief that R5 tropic viruses are selected during HIV-1 transmission [27]. Our study confirms the stochasticity of tropism spread, which results in the increase in the circulation of non-R5 tropic viruses and argues against the genetic bottleneck hypothesis and preferential infections with CCR5-utilizing variants. The temporal increase in the circulation of X4 tropic viruses, from 11.5% before 2001 to 23.3% in 2010 to 2012, was also found by Sierra-Enguita et al. in a cohort of recent seroconverters from Spain [28]. As a majority of infections in that cohort were related to MSM contact, the sexual spread of X4 tropic viruses may be similar to that observed in our study. Identification of the increased proportion of the non-R5 tropic viruses may also be related to the high HIV genetic diversity in our cohort and the duration of infection. Previously, a higher frequency of genotype-predicted X4 tropism was found in subtype D and CRF01_AE infections [8,29]. This is in accordance with the data from our study, with more common non-R5 sequence clusters among the cases infected with subtype D. In fact, two sequence pairs from heterosexually infected cases contained V3 sequences with FPR <5%, indicating a possible transmission of subtype D non-R5 viruses. However, it is also possible that these heterosexually infected cases represent late testers with a higher likelihood of the R5 to non-R5 tropism switch. In Poland, there are only limited data on the viral relationship and frequency of clustering. Recently, in a multicentre sample of 833 antiretroviral treatment-naive individuals, we identified that 20.9% of the subtype B sequences are transmission pairs and 23.7% are in clusters of ≥3 sequences, which indicates that the frequency of sequence clustering is similar across the country [30].
The higher frequency of the CCR5 Δ32/wt genotype in pure non-R5 clusters may reflect a loss of the beneficial effect in cases of exposure to X4/dual mixed tropic HIV-1. In the study by Brumme et al., individuals heterozygous for CCR5 Δ32 (Δ32/wt) were at a significantly (2.5 times) higher risk of harbouring X4 variants [31]. It must be noted, however, that a preferential selection of the X4 variants in CCR5 Δ32/wt cases is also possible; to date, there are no conclusive data on whether the presence of the Δ32 allele is associated with lower susceptibility to R5 infections or favours a switch to CXCR4-using clades.
Our study has the following limitations: first, the recently diagnosed HIV-1 positive patients were analyzed, but the data on the duration of infection were not available. It was also impossible to distinguish between the acute and chronic HIV infections. Our analysis suggests the clustered transmission of the non-R5 variants; however, data from recent seroconverters supplemented with donor sequences would have to be used to exclude the possibility of tropism evolution. A country-wide study would strengthen the conclusion on the possibility of the spread of the non-R5 tropic viruses in Central Europe. Second, there are only limited data for genotypic tropism assessments in non-B variants, with possible misinterpretation of genotypic tropism, especially in subtype D [29]. It was previously shown that the specificity of non-R5 predictions was lower for subtype D than for subtype B [32]. Therefore, the tropism findings should be interpreted with caution [21,33].
In conclusion, we found that R5 tropism predominates among treatment-naive individuals, but the increase in the frequency of non-R5 tropic variants may limit the clinical efficacy of the co-receptor inhibitors in the analyzed local population, which should be confirmed throughout the country. Increased prevalence of non-R5 HIV-1 may be related to the transmission of X4 clades; however, this conclusion should be interpreted with caution. Circulation of the non-R5 tropic variants among treatment-naive patients is important not only for the loss of susceptibility to maraviroc but also for the possibility of the spread of more rapidly progressing viral strains.

The clusters were based on the maximum likelihood estimated distances from the corresponding partial pol sequences and assigned by Cluster Picker software with a ≥90% aLRT value and a 4.5% maximum genetic distance, and were verified using Bayesian inference in BEAST with a posterior probability ≥95%.

a ANOVA test; b Fisher's exact test, two-tailed; HET – heterosexual transmission; MSM – men having sex with men; IDU – intravenous drug use.