Noninvasive western lowland gorilla's health monitoring: A decade of simian immunodeficiency virus surveillance in southern Cameroon

Abstract Simian immunodeficiency virus (SIVgor) causes persistent infection in critically endangered western lowland gorillas (Gorilla gorilla gorilla) from west central Africa. SIVgor is closely related to chimpanzee and human immunodeficiency viruses (SIVcpz and HIV‐1, respectively). We established a noninvasive method that does not interfere with gorillas' natural behaviour to provide wildlife pathogen surveillance and health monitoring for conservation. A total of 1,665 geo‐referenced fecal samples were collected at regular intervals from February 2006 to December 2014 (123 sampling days) in the Campo‐Ma'an National Park (southwest Cameroon). Host genotyping was performed using microsatellite markers, SIVgor infection was identified by serology and genetic amplification was attempted on seropositive individuals. We identified at least 125 distinct gorillas, 50 were resampled (observed 3.5 times in average) and 38 were SIVgor+ (seven individuals were seroconverters). Six groups of gorillas were identified based on the overlapping occurrence of individuals with apparent high rates of gene flow. We obtained SIVgor genetic sequences from 25 of 38 seropositive genotyped gorillas and showed that the virus follows exponential growth dynamics under a strict molecular clock. Different groups shared SIVgor lineages demonstrating intergroup viral spread and recapture of positive individuals illustrated intra‐host viral evolution. Relatedness and relationship genetic analysis of gorillas together with Bayesian phylogenetic inference of SIVgor provided evidence suggestive of vertical transmission. In conclusion, we provided insights into gorilla social dynamics and SIVgor evolution and emphasized the utility of noninvasive sampling to study wildlife health populations. These findings contribute to prospective planning for better monitoring and conservation.


| INTRODUC TI ON
Simian immunodeficiency viruses (SIVs) have been isolated from many wild African non-human primates species, each infected with a species-specific SIV Sharp & Hahn, 2011). The closest simian relatives of HIV-1 are SIVcpz from chimpanzees and SIVgor from gorillas, which each have been transmitted to humans in at least two occasions: SIVcpz strains from central chimpanzees (Pan troglodytes troglodytes) in southern Cameroon are the ancestors of the pandemic HIV-1 group M and the geographically restricted group N, whereas SIVgor strains from gorillas in Cameroon are the ancestors of HIV-1 group O and P, which each are also restricted to that country (D'Arc et al., 2015;Keele et al., 2006;Sharp & Hahn, 2011;Van Heuverswyn et al., 2006, 2007. In turn, SIVgor originated after a cross-species transmission of SIVcpz (D'Arc et al., 2015;Takehisa et al., 2009). Infection with SIVcpz is common in central and eastern chimpanzees (P.t. schweinfurthii) that are distributed in central Africa countries, but SIVgor is restricted to critically endangered western lowland gorillas (Gorilla gorilla gorilla) from southern Cameroon (D'Arc et al., 2015;Keele et al., 2006;Li et al., 2012;Neel et al., 2010;Santiago et al., 2003;Van Heuverswyn et al., 2007). It is unclear whether the current distribution of SIVgor is related to a recent cross-species event, geographic barriers or host population decline due to pathogenicity. Previous studies on habituated wild eastern chimpanzees in Tanzania showed that SIVcpz infection has a negative impact on the health, reproduction, and survival of chimpanzees and caused the decline of one community (Keele et al., 2009;Rudicell et al., 2010;Terio et al., 2011). AIDSrelated symptoms have also been observed in a naturally infected central chimpanzee from Cameroon (Etienne et al., 2011). Studies on the impact of SIVgor infection are scarce but they are much needed in order to understand gorilla population dynamics.
The integration of research on infectious diseases and primate ecology provides not only benefits for gorillas' conservation but also provides insights into the role of disease in primate evolution and into the emergence of disease in humans (Leendertz et al., 2006). A major issue to monitor health in western lowland gorillas is the lack of habituated SIVgor infected populations. Nonetheless, there is the possibility of acquiring indirect information related to SIVgor infection on a cross-sectional basis using noninvasive sampling. During previous surveys conducted between 2004 and 2011, we identified several SIVgor infected gorillas in the Campo Ma'an National Park in southwest Cameroon (Neel et al., 2010). We demonstrated the feasibility of long-term monitoring of SIVgor infection in nonhabituated gorillas and we showed for the first time that it is possible to re-sample gorillas in a noninvasive way and to characterize SIVgor diversity (Etienne et al., 2012). Here we included additional 4 years of follow-up, inferred groups of gorillas using geo-referenced data over time and performed genetic and phylogenetic analyses using state-of-the-art approaches.
We sought to advance our understanding of SIVgor transmission and evolution among gorillas in the Campo-Ma'an National Park and to further validate a noninvasive approach to better monitor and inform the conservation of this critically endangered and elusive species.

| Study site and sample collection
The Campo-Ma'an National Park is located in the South Western corner of Cameroon, bordering on Equatorial Guinea to the south and the Atlantic Ocean to the west covering an area of approximately 2,640 km 2 . This study was conducted in an area of approximately 200 km 2 at the center of the park where we previously characterized SIVgor infected gorillas (Etienne et al., 2012). Opportunistic collection of fecal samples occurred between 07:00 and 17:00 hr. Field data included sample's GPS coordinates, location type (nest, feeding site or directly on track), estimated time of fecal deposition and, since 2012, the hour of collection. About 20 g of fecal material was collected in a 50-ml tube, containing 20 mL of RNAlater ™ (Ambion, Austin, TX, USA), and kept at ambient temperature, for a maximum of 2 weeks, and then stored frozen.

| Detection of SIVgor antibodies and viral RNA in fecal samples
The fecal samples were tested with the INNO-LIA ™ HIV I/II score (Innogenetics, Ghent, Belgium) to survey the presence of crossreactive antibodies against HIV-1. Several independent RNA extractions were performed for each specimen with reactive antibodies, followed by multiple RT-PCR attempts to increase the chance of SIVgor genetic amplification. Total nucleic acids were extracted with the NucliSens system (Biomérieux, Craponne, France), and RT-PCR reactions were performed using SIVgor and SIVgor/ SIVcpz/HIV-1 consensus primers that targeted the env (200,315 or 440 bp of the gp41 transmembrane envelope protein's ectodomain) and pol (245 or 330 bp of the integrase) regions as previously described (D'Arc et al., 2015;Etienne et al., 2012;Neel et al., 2010). cDNA were generated with Expand reverse transcriptase and Expand long-template PCR system (Roche Diagnostics, Indianapolis, IN, USA) according to the manufacturer's instructions. For each RT reaction, we used 10 μl of fecal viral RNA extract that we first incubated with 40 pmol of the outer reverse R1 primer for 10 min at 65°C, rapidly cooled on ice. We then added this mixture to the RT-PCR components, including 20 U of RNase inhibitor (Applied Biosystems/Ambion), in a final volume of 20 μl.
The mix was finally incubated for 60 min at 42°C, followed by 5 min at 95°C to inactivate the enzyme. Ten (10) μl of the reversetranscribed RNA was used for downstream first round PCR using F1 and R1 primers. For the second round of the nested PCR, we used 5 μl of the first round PCR product in a final reaction volume of 50 μl. PCR amplifications included 45 cycles of denaturation (94°C, 20 s), annealing (45-55°C, depending on the primer Tm for 30 s), and elongation (68°C, 1 min) in a thermal cycler. PCR products were purified and directly sequenced using an automated sequencer (3,130 × l or 3,150 Genetic Analyser; Applied Biosystems, Foster City, CA, USA).

| Genetic characterization of Gorillas
Sex determination was performed using PCR products generated from the amelogenin gene that contains a deletion in the X, but not in the Y chromosome as previously described (Etienne et al., 2012).
Host genotyping was performed using seven microsatellite loci (D18S536, D4S243, D10S676, D9S922 D2S1326, D2S1333, and D4S1627). To minimize allelic dropout, three to seven amplications were performed on homozygous loci. When PCR reactions yielded poor results, a new set of PCRs was performed on a new fecal nucleic acids extract. Samples that did not provide any successful result after five PCR attempts and two independent DNA extractions were discarded and considered as degraded. Samples with an incomplete allelic profile (less than four loci) or mixed profile were also discarded from further analyses.
Seven additional microsatellite loci (vWF, D7s817, D7s2204, D16s2624, D8s1106, D10s1432, and D1s550) were obtained from at least one nucleic acids extract of each gorilla to improve the estimation of relatedness and relationship among the different individuals.
Genetic diversity was quantified by estimating observed and expected heterozygosis. Test for Hardy-Weinberg equilibrium (HWE) for each locus and test for linkage disequilibrium between loci were performed using the package adegenet (Jombart, 2008;Jombart & Ahmed, 2011).
A Minimum Spanning Network (MSN) of microsatellite haplotypes was constructed using the Prevosti (Prevosti, Ocana, & Alonso, 1975) and the Bruvo's distance (Bruvo, Michiels, D'Souza, & Schulenburg, 2004) for its ability to handle missing data and that were included in the package (Kamvar, Brooks, & Grunwald, 2015;Kamvar, Tabima, & Grunwald, 2014). The relatedness value (r d ) was calculated using the Wang estimator (Wang, 2017;Wang, 2002) included in the package Demerelate (Kraemer & Gerlach, 2017), the Maximum Likelihood (ML) approach implemented in the MLrelate software (Kalinowski, Wagner, & Taper, 2006), and Bayesian estimation using the package SOLOMON (Christie, Tennessen, & Blouin, 2013). The Wang estimator was chosen because it is robust for small samples sizes and applies to any number of alleles per locus and to any allele frequency distribution (J. Wang, 2017; J. L. Wang, 2002). The Bayesian parentage analysis requires the prespecification of adults and offspring genotypes. To account for this, we randomly distributed our data in two groups and then performed the analysis. We repeated this approach ten times and preferred the pairs that were hypothesized as parent-offspring in any of them; a cutoff value of 0.05 was chosen for the posterior probability of a pair being false given frequencies of shared alleles.

| Clustering of individuals to reconstruct gorilla groups
At every sampling day, the sample's geographical coordinates were used to trace individuals and to cluster them. We used distance values ranging from 10 to 500 m. We used the estimated time of fecal deposition to more accurately trace the occurrence of gorillas over time (e.g., an estimated fecal deposition of 24 hr changed an individual's occurrence to the preceding day). The clustering of gorillas was performed following a stepwise procedure coded in R (R Core Team, 2014). Gorillas were assigned to the same group when their traced ranges overlapped at a given time-point. Each time an individual with a membership was recaptured, any other individual observed together was added to the corresponding group. The outcomes of the clustering algorithm were branching diagrams for every sampling day (which were generated using the geodesic distance and distance-matrix methods) and the group's identity of each gorilla using the prespecified distance cutoff. We further analyzed the demarcation of groups over time and scrutinized those cases where two groups merged by examining the corresponding field notes and excluding observations in ensuing sensitivity analysis.

| SIV Phylogenetic analysis
The novel and previous SIVgor genetic sequences were used in reconstruction of phylogenetic trees. When multiple sequence reads from the same individual and the same day were available, consensus sequences were generated and used in time-scaled evolutionary analyses following Markov chain Monte Carlo sampling as implemented in BEAST v1.8.3 software package (Drummond, Suchard, Xie, & Rambaut, 2012). Reconstructions were performed using both genetic regions. Regressions of root-to-tip genetic distance against sampling time for each of the datasets were performed using TempEst v1.5.1 (Rambaut, Lam, de Carvalho, & Pybus, 2016) and trees constructed with maximum-likelihood algorithms in FastTree 2.1 (Price, Dehal, & Arkin, 2009. All datasets exhibited positive correlation between genetic divergence and sampling time and appeared to be suitable for phylogenetic molecular clock analysis (correlation coefficient of 0.29 and 0.14 for gp41 and pol, respectively). The datasets were analyzed using a HKY+4Γ (Hasegawa, Kishino, & Yano, 1985) nucleotide substitution model plus a demographic coalescent tree prior model (among the parametric constant size, exponential or logistic growth models and the nonparametric Skyride one) (Drummond, Rambaut, Shapiro, & Pybus, 2005;Minin, Bloomquist, & Suchard, 2008) and either a molecular strict clock model or a relaxed uncorrelated lognormal one (Drummond, Ho, Phillips, & Rambaut, 2006). Model comparisons were performed using the path sampling and stepping stone approach (A minimum of 50 steps with the length of each chain being at least one million iterations) (Baele et al., 2012). For each dataset, two to four Markov chain Monte Carlo chains of 20-200 million iterations were computed. Samples were combined and diagnosed using visual trace inspection and calculation of effective sample sizes (>200) in Tracer (Rambaut, Drummond, Xie, Baele, & Suchard, 2018).  Figure   S1). A total of 50 gorillas were recaptured and were observed 3.5 times in average (3rd Quartile = 5, Maximum = 9). Table 1 shows that we have in average 8.6 microsatellite alleles per locus. Most loci have <8% missing observations; loci D7s817, D7s2204, and D10s1432 had between 17 and 23 percent missing observations. Overall, loci were highly polymorphic and evenly distributed. Most loci were under the null expectation of Hardy-Weinberg equilibrium (p > 0.05) except for loci vWF, D7s817, D8s1106, D10s1432, and D2S1333 (Table 1). Figure S1 underscores that most individuals were heterozygous for each locus and that loci were not linked (low r d value).    Member of groups A, C, and E was consistently re-sampled over more than 5 years. The samples from individual CPg-001 spanned a decade (2004, 2008, and 2014).

| Reconstruction of gorilla groups over time
The ranges of Groups A and E intersected once during a sampling day in May 2011 as a cause of sample CR6687 (CPg-035). Both groups kept their original identity when this sample was removed in sensitivity analyses. Sample CR6687 was found directly on track rather than a nesting site so its occurrence next to samples from another group may reflect a chance event rather than the existence of a unique larger group.
The ranges of Groups A and E also overlapped once when using a range ≥200 m due to the proximity of individuals in a single sampling day during February 2007. However, both groups kept their distinctiveness when this observation was removed in sensitivity analysis. It is worth mentioning that western lowland gorillas occasionally from "nesting supergroups" where two groups nest at distances of 30-50 m (Bermejo, 2004).

| Interindividual relatedness
A MSN of microsatellites using Prevosti's distance is shown in Figure 3. This MSN shows that genetically close haplotypes did not match the groups that were delineated using the overlapping occurrence of individuals. Similar findings were observed with the MSN of microsatellites using Bruvo's distance (Supporting Information Figure S6). We used three different metrics to assess interindividual's relatedness and we retained the pairs that had the highest estimates. In the Wang's metric, we kept the pairs with an estimated value over 0.6.
In the ML calculations, we kept the pairs that were discriminated as Supporting Information Figure S2 shows the genetic relatedness of individuals as networks for groups A, B, and E. The percentage of individuals than fell within the largest network in each group (a proxy of the family history) ranged between 21% and 48%, whereas the percentage of individuals in each group that were nonrelated ranged between 25% and 69%. In average, the percentage of individuals in each group that were sampled once was of 47% and among them between 64% and 100% did not fell within the largest network.

| SIVgor evolution
A total of 38 WLG were SIVgor+ (18 females  We obtained partial gp41 gene sequences from 25 genotyped gorillas and partial pol gene sequences for 16 of them. The models with the higher log marginal likelihood using either stepping stone sampling or path sampling were a strict molecular clock model and an exponential demographic model (Table S1). In all cases, we obtained "overwhelming" (2 log e Bayes factors [BF] >10) support for strict clocks over relaxed ones (Kass & Raftery, 1995). For gp41, the exponential model support was "Positive" (2 log e BF 2-6) over the remaining parametric models and "Strong" (2 log e 6.7) over the nonparametric Skyride; weak support was observed for pol when comparing the exponential model to the remaining parametric ones (2 log e BF ~ 1.7) and "Positive" (2 log e BF 3.1) when compared to the Skyride. For pol, the inferred time to the most common recent ancestor (TMRCA) was around 1970 (95% HPD interval 1938-1993), the evolutionary rate was 1.27 × 10 −3 (95% HPD interval 5.06 × 10 −4 to 2.06 × 10 −3 ) and the exponential growth rate was 0.039 per year.
For env, the inferred TMRCA was around 1964 (95% HPD interval , the evolutionary rate was 5.02 × 10 −3 (95% HPD interval 2.8 × 10 −3 to 7.1 × 10 −3 ) and the exponential growth rate was 0.033 per year. Concerning the phylogenetic relationships of viral samples from seroconverters Individuals: CPg-041, CPg-066, and CPg-128 grouped with samples from the corresponding same group; CPg-032 grouped with CPg-031 whose group identity was uncertain and their closest relative was the clade containing CPg-001, which belonged to the former's same group; and CPg-036 closest relative belonged to another group. For the two remaining seroconverters (CPg-094 and CPg-115), there were no viral sequence data available.

| SIVgor transmission
In order to assess SIVgor transmission we opted for less stringent constraints-in the interindividual's relatedness analysis-to include more pairs where both individuals were SIVgor positive and had SIVgor genetic data. These constraints were as follows: the pairs that had a Wang estimate over the full siblings threshold (0.43 was the data-inferred relatedness-threshold for full siblings), the pairs that were discriminated as parent-offspring in the ML analysis and where the log likelihood of the individuals being unrelated was at least three times less than the log likelihood of the parent-offspring, and the pairs with a posterior probability below 0.5 in the Bayesian calculations.

| D ISCUSS I ON
The present study provides new insights into the ecology of un- apes. Our work relied on the sampling of fecal specimens and our collection procedures was chosen in a way that preserved RNA and DNA in stool and prevented nucleic acid degradation. As a result, we succeeded in genotyping individuals using amplified microsatellite data and we also succeeded in amplifying and sequencing SIVgor genetic data from some of the seropositive individuals. Overall, we validated the usefulness of noninvasive sampling to study wild primate populations and infectious diseases.
The groups proposed by our study did not exhibit any clear spatial distribution at the current geographical scale. Some groups were only observed once or only during a couple of years. The infrequent sampling of some groups may be the result of sampling efforts: our field teams prioritized sites where gorillas were already spotted in the past. Groups of western lowland gorillas have an annual home range of 7-14 km 2 (Bermejo, 2004;Yamagiwa et al., 2003), and it is possible that some groups roamed to other sites of the park in the following years. Females predominated in most groups but there were several instances where more than one male was present in other groups. Multi-male and all-male groups are suggested to be rare in western lowland gorillas, when compared to mountain gorillas, where male emigration is lower (Davenport, 2008;Gatti et al., 2004;Parnell, 2002). Multi-male groups are the outcome of maturing male offspring that do not emigrate out of the group but in our case we cannot discard the possibility of these males being infants or juveniles; in most cases, the males were related. Western lowland gorillas can also be found in nonbreeding, all-male groups that can last for several years, although they are subject to frequent membership changes due to male migrations (Gatti et al., 2004).
Gorillas are among the relatively few mammal species for which both sexes usually disperse from their natal groups (Stoinski, Perdue, & Legg, 2009;Stokes, Parnell, & Olejniczak, 2003). Understanding wild primate group dynamics is critical to anticipate the demographic consequences of human activities, to measure the impact of disease outbreaks and to identify other drivers of demographical change. Interactions between lowland gorillas groups occur more frequently than they do in mountain gorillas and are often nonaggressive (Bradley, Doran-Sheehy, Lukas, Boesch, & Vigilant, 2004;Doran-Sheehy, Greer, Mongo, & Schwindt, 2004). Bradley et al. (2004) underscored that such nonaggressive behaviour is attributed to male's kinship networks across groups (related males leading groups that range in the same area) and that this kind of encounters facilitates high rates of migration of females.
Although in this study we did not find a clear male's kinship networks across groups, we provided additional evidence of migration based on SIV linage associations. We evidenced the same SIVgor strain circulating in different groups and correspondingly, different lineages within the same group. The lack of a distinctive viral lineage for every group underscores that there is not major viral transmission barrier for the spread of SIV and that pathogens can rapidly spread between groups.
Simian immunodeficiency virus vertical transmission (from mother to infant) is argued to be less frequent than horizontal transmission (between individuals, through wounds, and sexual contact).
The junction of our genetic and phylogenetic findings provides evidence of SIVgor transmission among close relatives in two different sets of gorillas. Closely related viruses have being observed in a presumed parent-offspring pair of chimpanzees suggesting vertical transmission (Keele et al., 2009). Moreover, correlation between SIV infection and maternal kinship that does not involve vertical transmission has been documented in Mandrills (through allogrooming, wound care, or the inheritance of genetic determinants of susceptibility to SIV among others) (Fouchet et al., 2012) and in one infant chimpanzee through breast milk transmission (Keele et al., 2009). In any of the two sets of gorillas with probable vertical transmission, we observed that the individuals who were sampled last were also observed only once but we do not have enough evidence to rule out that this is due to increased mortality rather than to chance. In the same vein, it is generally considered that SIVs do not generally cause acquired immunodeficiency syndrome in their natural hosts but Pandrea, Silvestri, and Apetrei (2009)  we ranked competing models and used the best fit to the empirical data the latter do not necessarily reflect the overall population dynamics.
Our study has limitations. Our sampling relied on the opportunistic collection of samples and therefore we cannot exclude that some regions of the Campo-Ma'an National Park and some groups were missed. Although individuals from both sexes frequently disperse when reaching sexual maturity (Stoinski et al., 2009;Stokes et al., 2003) with not quantitative sex-bias in migration (Funfstuck et al., 2014), our clustering algorithm does not consider migration of individuals between groups because this requires the frequent recapture of the same individuals and a more thorough sampling. Previous studies showed that the mean distance between individuals in zoohoused lowland gorilla groups was in average 23 m while wild mountain gorillas usually stay within 10 m of other members (Kurtycz, Shender, & Ross, 2014;Juichi Yamagiwa, 1983). Moreover, we documented four cases of one individual being sampled twice on the same day and the averaged distance between paired samples was 200 m. Consequently, our clustering distance values were chosen to reflect this but we cannot exclude that the largest groups may represent multiple groups that frequently exchanged individuals, that merged, or whose action range overlapped prior to the time of sampling (the home range of gorillas extensively overlap with those of neighboring groups and they have a large mean day journey length, between 1,100 and 2,600 m (Yamagiwa et al., 2003). However, in addition to using multiple distance cutoffs, we also removed observations in sensitivity analyses and the resulting groups represent the best division in terms of occurrence and field research. Finally, we were not able to distinguish between adults and nonadults, to genotype all samples or to amplify SIVgor from all seropositive samples.
Although we have characterized nearly complete SIV genomes from some fecal samples (D'Arc et al., 2015) we succeeded more often with smaller genetic fragments.
Despite all the aforementioned issues, our results validate a noninvasive approach to the recovery of otherwise unobtainable information directly related to the ecology of gorillas and of infectious diseases in the wilderness. In the same vein, this approach could allow to survey the microbiome and the virome using metagenomics.
For example, D'arc et al. provided evidence of association of specific mammalian viral families and SIVgor in a dysbiosis context (D'Arc et al., 2018). Thus, these studies provide a baseline to identify markers associated with pathogenic conditions.
In conclusion, given the critically endangered status of western lowland gorillas, a noninvasive sampling represents a viable approach to study wild population and disease and the findings derived from it contribute to the prospective planning for better monitoring and conservation and also contribute to a better understanding of SIVgor evolution and transmission.

ACK N OWLED G M ENTS
This work was supported in part by grants from the National Institute

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
CJVA designed and performed data analysis, interpreted the data and wrote the manuscript; AA, AE, MD performed laboratory and data analysis; EMN coordinated field studies; MP designed analysis, interpreted the data, and wrote the manuscript; all authors discussed the results and implications.

DATA ACCE SS I B I LIT Y
New sequences were deposited in GenBank under accessions MH646552-MH646620. Microsatellite genotypes are available as Supporting Information Table S2.