Gene flow between island populations of the malaria mosquito, Anopheles hinesorum, may have contributed to the spread of divergent host preference phenotypes

Abstract Anopheles hinesorum is a mosquito species with variable host preference. Throughout New Guinea and northern Australia, An. hinesorum feeds on humans (it is opportunistically anthropophagic) while in the south‐west Pacific's Solomon Archipelago, the species is abundant but has rarely been found biting humans (it is exclusively zoophagic in most populations). There are at least two divergent zoophagic (nonhuman biting) mitochondrial lineages of An. hinesorum in the Solomon Archipelago representing two independent dispersals. Since zoophagy is a derived (nonancestral) trait in this species, this leads to the question: has zoophagy evolved independently in these two populations? Or conversely: has nuclear gene flow or connectivity resulted in the transfer of zoophagy? Although we cannot conclusively answer this, we find close nuclear relationships between Solomon Archipelago populations indicating that recent nuclear gene flow has occurred between zoophagic populations from the divergent mitochondrial lineages. Recent work on isolated islands of the Western Province (Solomon Archipelago) has also revealed an anomalous, anthropophagic island population of An. hinesorum. We find a common shared mitochondrial haplotype between this Solomon Island population and another anthropophagic population from New Guinea. This finding suggests that there has been recent migration from New Guinea into the only known anthropophagic population from the Solomon Islands. Although currently localized to a few islands in the Western Province of the Solomon Archipelago, if anthropophagy presents a selective advantage, we may see An. hinesorum emerge as a new malaria vector in a region that is now working on malaria elimination.

In this study, we develop basic population genetic knowledge in a malaria-transmitting species, Anopheles hinesorum (previously An. farauti 2), belonging to the Anopheles farauti complex. Although phylogeographic, behavioural and ecological studies have been performed on many members of this species complex (Ambrose et al., 2012; Beebe et al., 2000, 2015; Van Den Hurk et al., 2000), the basis of anthropophagy in the complex is as yet unstudied. The An. farauti complex is a particularly useful study system for elucidating the molecular basis of human host preference in mosquitoes because host preference differs among geographically isolated populations and species (Beebe et al., 2015). Anopheles hinesorum is possibly the most useful species in the complex for studying anthropophagy because of intraspecific differences in host preference.
Anopheles hinesorum has a wide distribution through much of the south-west Pacific (Australia, New Guinea and the Solomon Archipelago), being found in coastal and inland habitats up to elevations of over 1000 m above sea level (Beebe et al., 2015).
Throughout most of its range, An. hinesorum is a host generalist, being opportunistically anthropophagic (Cooper et al., 2009; Keven et al., 2017; Laurent et al., 2017; Sweeney et al., 1990). However, most populations from the Solomon Archipelago do not bite humans (they are exclusively zoophagic; Beebe et al., 2000; Foley et al., 1994). This is a well-established phenotypic difference, and recent fieldwork (2015 and 2018) in Guadalcanal in Solomon Islands has further verified this finding: no An. hinesorum were collected in human landing catches (HLCs). These HLCs were performed near (within 50 m of) productive larval sites. No one has yet been able to collect blood-fed adults from these populations, and their hosts remain unknown. A previous study also showed that exclusive zoophagy in An. hinesorum is a derived trait, finding two distinct zoophagic mitochondrial (mtDNA) lineages (Ambrose et al., 2012). Ambrose et al. (2012) hypothesized that the evolution of exclusive zoophagy in these lineages may have occurred independently by convergent evolution. They found that the two lineages likely represent two separate dispersal events colonizing the Archipelago at different times in the past, with the northern lineage representing an older dispersal event and the southern lineage representing a more recent dispersal event. In contrast to the hypothesis of convergent evolution of zoophagy, it is also possible that the initial (older) colonizing lineage had already adapted to feeding on local island hosts and that zoophagy was transferred from this preadapted population to the secondary (younger) colonizers via gene flow. Another possible scenario is that An. hinesorum colonized all islands in the Solomon Archipelago shortly after arriving there and that the secondary dispersal event to the southern islands resulted in the introduction and spread (via selective sweep) of a new mitochondrial lineage.
Finally, it is possible that the initial population on the islands (presumably colonists from New Guinea or Australia) evolved or already exhibited zoophagy and contained multiple mitochondrial lineages which subsequently became dominant in the north and the south of the Archipelago.
As mentioned above, most populations of An. hinesorum in the Solomon Archipelago are exclusively zoophagic, including populations from Bougainville and Guadalcanal (Foley et al., 1994). However, a recent study revealed anthropophagy in the Western Province of the Solomon Islands, where adult female An. hinesorum were collected landing (i.e. attempting to feed) on humans (Burkot et al., 2018). A few samples of the species have been collected on one other occasion in human landing catches on Santa Isabel, another island of the Solomon Archipelago, where it is very common in larval collections (Bugoro et al., 2011). Taken together, these studies show that there are behavioural differences in host preference between populations of this species within the Solomon Archipelago. The recently discovered anthropophagic population may have emerged as the result of the re-evolution of anthropophagy from a zoophagic population. Alternatively, anthropophagy may have been spread via gene flow from anthropophagic population(s) in Australia or New Guinea.
In this study, our first aim is to complement and build on previously published work with new nuclear microsatellite and mitochondrial data to better understand the population structure of An. hinesorum. This will lay the groundwork for its development as a novel model system for studying human host preference in mosquitoes. Our second aim is to use nuclear data to assess whether gene flow may have contributed to the spread of zoophagy between northern and southern island populations. Our third aim is to evaluate whether there is any evidence of gene flow from mainland Australia or New Guinea into the newly discovered anthropophagic An. hinesorum population in the Archipelago. To achieve these aims, we build on mitochondrial data (n = 233) published in Ambrose et al. (2012) to include additional Solomon Archipelago populations (n = 61). We develop 14 novel microsatellite primers for the species and generate microsatellite data from throughout the species range (n = 456). We include mitochondrial and nuclear microsatellite data from samples collected in human landing catches by Burkot et al. (2018), from the anthropophagic Western Province Solomon Islands population.

| Sampling and species identification
Specimens for this study were collected as both larvae and adults, with some samples collected in human landing catches (Table 1 and Figure 1). Genomic DNA was isolated, and samples were verified as being An. hinesorum using a well-established PCR diagnostic method (Beebe & Saul, 1995).

| Mitochondrial sequencing and analysis
We sequenced, edited and aligned a 527 base-pair sequence of the mitochondrial cytochrome oxidase 1 gene (mtDNA COI) for 60 individuals in this study. We aligned this with previously published homologous sequence data (n = 206; Ambrose et al., 2012) for further analysis. The new data include 27 individuals from Santa Isabel Island (including eight adult females caught in human landing catches in a previous study; Bugoro et al., 2011), 26 samples from the Western Province of the Solomon Islands (see Table 1), five individuals (larvae) from the Nggela Islands and two additional individuals (larvae) from Bougainville Island. Of the 26 individuals collected from Solomon Islands Western Province, 16 were adult females (collected biting humans) and 10 were collected as larvae. We generated data for this study using the same primers and methods outlined in Ambrose et al. (2012) and then edited and realigned them to the preexisting COI alignment in the program Geneious v.8 (Kearse et al., 2012). To assess relationships between populations, we generated a median joining mitochondrial haplotype network using the program PopART (Bandelt et al., 1999).
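The haplotype-level comparisons described above rest on two simple operations: collapsing aligned sequences into unique haplotypes and checking which haplotypes are shared between populations. The following is a minimal sketch of those operations only; it is not the median-joining algorithm implemented in PopART, and the function names and the short toy sequences are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict


def collapse_haplotypes(samples):
    """Group aligned sequences into haplotypes.

    samples: dict mapping sample id -> (population, aligned sequence).
    Returns a dict mapping each unique sequence -> {population: count}.
    """
    haplotypes = defaultdict(lambda: defaultdict(int))
    for _sample_id, (population, seq) in samples.items():
        haplotypes[seq.upper()][population] += 1
    return {seq: dict(pops) for seq, pops in haplotypes.items()}


def shared_haplotypes(haplotypes, pop_a, pop_b):
    """Return the haplotypes sampled in both pop_a and pop_b."""
    return [seq for seq, pops in haplotypes.items()
            if pop_a in pops and pop_b in pops]


def hamming(seq_a, seq_b):
    """Count differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Toy data (hypothetical 6-bp "alignment", not real COI sequence).
toy = {
    "s1": ("Western Province", "ACGTAC"),
    "s2": ("New Guinea", "ACGTAC"),
    "s3": ("Guadalcanal", "ACGTTC"),
}
toy_haplotypes = collapse_haplotypes(toy)
print(shared_haplotypes(toy_haplotypes, "Western Province", "New Guinea"))
print(hamming("ACGTAC", "ACGTTC"))
```

Pairwise site differences such as those returned by `hamming` are the raw distances from which a haplotype network is built; the network construction itself is left to dedicated software.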

| Microsatellite development and scoring
We developed 14 novel microsatellite markers using the same methods described in Ambrose et al. (2014). We called fragment sizes manually using the program GeneMarker v.2.2 (Hulce et al., 2011) and removed individuals missing data from six or more loci from the data set prior to analysis, leaving 456 individuals in the final data set. We initially defined populations based on genetically distinct groups identified by Ambrose et al. (2012), and we treated separate islands in the Solomon Archipelago as populations. We then checked for the presence of null alleles using MicroChecker v2.2.3 (Van Oosterhout et al., 2004) and for Hardy-Weinberg equilibrium (HWE) in the R package PopGenReport (Adamack & Gruber, 2014). For primer- and locus-specific information, including information on null alleles and HWE, see Table S1.
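The missing-data filter described above (dropping individuals with no call at six or more of the 14 loci) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the procedure actually used: the data layout (one pair of allele sizes per locus, with `None` marking a missing call) and the function names are hypothetical.

```python
def count_missing_loci(genotype):
    """Count loci with a missing call.

    genotype: list of (allele1, allele2) tuples, one per locus;
    None in either position marks the locus as missing.
    """
    return sum(1 for a1, a2 in genotype if a1 is None or a2 is None)


def filter_individuals(genotypes, max_missing=5):
    """Keep individuals with at most max_missing missing loci.

    With max_missing=5, individuals missing six or more loci are dropped.
    """
    return {ind: g for ind, g in genotypes.items()
            if count_missing_loci(g) <= max_missing}


# Toy example with 14 loci (allele sizes are invented).
complete = [(150, 152)] * 14
five_missing = [(None, None)] * 5 + [(150, 152)] * 9
six_missing = [(None, None)] * 6 + [(150, 152)] * 8
kept = filter_individuals(
    {"ind1": complete, "ind2": five_missing, "ind3": six_missing}
)
print(sorted(kept))
```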

| Microsatellite population structure
We performed a variety of analyses to assess population structure of An. hinesorum throughout its range based on nuclear microsatellite data. These include Bayesian analyses (STRUCTURE), multivariate analyses, estimation of fixation indices, a neighbour-joining tree based on pairwise G′ST and AMOVA. Initially, we assessed population structure using the Bayesian clustering program STRUCTURE v. 2.3.4 (Pritchard et al., 2000). We ran STRUCTURE through the program STRUCTURE_threader (Pina-Martins et al., 2017) for 20 iterations of K = 2 to K = 15, using the admixture model and location priors (100,000 generation burn-in, 500,000 generation sampling).
Sites where mosquitoes were sampled were used to define location priors for populations from Australia and New Guinea. In the Solomon Archipelago, we used the islands that individuals were sampled from as location priors. We ran STRUCTURE output through the CLUMPAK server (Kopelman et al., 2015), with default CLUMPP (Jakobsson & Rosenberg, 2007) and DISTRUCT (Rosenberg, 2004) settings, including the LargeKGreedy algorithm (in CLUMPP), with a random order of input and 2000 repeats. We determined the most strongly defined population structure in the data using CLUMPAK, which implements the Evanno delta K method (Evanno et al., 2005), as well as the most probable K based on the 'Estimated Ln Prob of Data' (Kopelman et al., 2015). As has been found previously, the Evanno method underestimated the optimal value of K (Janes et al., 2017).
We therefore present the major mode for STRUCTURE plots for both K = 2 (predicted by the Evanno method) and K = 10 (predicted by the 'Estimated Ln Prob of Data'). Additional STRUCTURE plots for all K values run can be found in Data S1.
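For readers unfamiliar with the Evanno statistic, the sketch below shows one common way of computing delta K from replicate STRUCTURE runs: the absolute second-order difference of the mean Ln P(D) across successive K values, divided by the standard deviation of Ln P(D) at K. This is an assumed formulation for illustration (CLUMPAK's internal implementation may differ in detail), and the input values are invented.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute delta K for every K that has neighbours K-1 and K+1.

    lnp_by_k: dict mapping K -> list of Ln P(D) values from replicate runs.
    Returns a dict mapping K -> delta K. Values of K whose replicates have
    zero standard deviation are skipped because delta K is undefined there.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in means or (k + 1) not in means:
            continue  # the first and last K have no second-order difference
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Invented Ln P(D) values: a large jump from K=1 to K=2, then a plateau.
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-800.0, -802.0],
    3: [-790.0, -794.0],
    4: [-785.0, -795.0],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
print(best_k)
```

Because delta K rewards the largest change in slope, it tends to pick out the uppermost level of hierarchical structure, which is consistent with the contrast between K = 2 and K = 10 reported here.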
TABLE 1 Summary of Anopheles hinesorum collections and of the number of individuals genotyped for nuclear microsatellites and mtDNA COI
We also used three multivariate clustering methods to assess population structure: principal components analysis (PCA), discriminant analysis of principal components (DAPC; Jombart et al., 2010) and t-distributed stochastic neighbour embedding (t-SNE; Van Der Maaten & Hinton, 2008). t-SNE is a multivariate method based on machine learning that is used to visualise multidimensional data in two or three dimensions. It is similar in concept to principal component analysis in that it arranges points (representing individuals) in space such that highly similar points are located close together (clustered) while dissimilar points are dispersed (Van Der Maaten & Hinton, 2008). An advantage of these multivariate approaches is that they are free of population genetic assumptions; for example, there is no assumption that populations are in HWE. We performed both PCA and DAPC analyses in the adegenet package (Jombart, 2008) and the t-SNE analysis in the Rtsne package (Krijthe, 2015) in R version 3.3.0 (R Core Team, 2013), run through RStudio version 1.0.136 (Rstudio Team, 2020). For these analyses, we replaced missing data with mean values for the overall data. We used the cross-validation xvalDapc command with 1000 replicates to determine the optimal number of principal components (PCs) to retain for each analysis. For the full data set, we retained 80 PCs and ten discriminant axes (DAs), and for the Solomon Archipelago data alone (n = 177), we retained 30 PCs and 4 DAs. We present two types of plots that were generated from the DAPC: a composition plot (a bar plot similar to a STRUCTURE plot) and pairwise plots of the first two discriminant axes.
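The mean-replacement step that precedes the multivariate analyses can be illustrated with a small sketch. It assumes that genotypes are coded as a numeric individuals-by-alleles matrix (for example, allele counts) and that each missing entry is replaced by the mean of the observed values in its column; this layout and the function names are assumptions for illustration, not a description of the adegenet internals.

```python
def impute_column_means(matrix):
    """Replace None entries with the mean of the observed values in the column.

    matrix: list of rows (individuals), each a list of numbers or None.
    A column with no observed values is filled with 0.0.
    """
    n_cols = len(matrix[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in matrix if row[j] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [col_means[j] if row[j] is None else float(row[j])
         for j in range(n_cols)]
        for row in matrix
    ]


def centre_columns(matrix):
    """Subtract each column mean, as is done before a PCA."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in matrix]


# Toy allele-count matrix: 3 individuals x 2 alleles, one missing value.
toy_counts = [[2, None], [0, 1], [1, 0]]
imputed = impute_column_means(toy_counts)
centred = centre_columns(imputed)
print(imputed)
print(centred)
```

A useful property of this choice is that an imputed value sits exactly at the column mean, so after centring it contributes zero to that axis and does not pull the individual towards any cluster.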
We used the program GenAlEx v.6.5 (Peakall & Smouse, 2006) to estimate pairwise fixation indices, GST, G′ST and Jost's D, between the populations identified by STRUCTURE and multivariate methods. We plotted results for one of these indices (G′ST) in tabular form, as well as building a neighbour-joining tree based on pairwise G′ST using the R package ape (Paradis & Schliep, 2019). Finally, we performed an AMOVA (Excoffier et al., 1992) to partition variance explained by different hierarchical strata in the data. To achieve this, we used the poppr.amova function implemented in the R package poppr (Kamvar et al., 2014). Prior to running the AMOVA, we defined strata within our data by region (Australia, New Guinea, Solomon Archipelago) as well as by populations identified by STRUCTURE and multivariate analyses. We then performed a randomized test using the randtest function to assess whether there is significantly more or less variance explained by different partitions (strata) in the data compared with the null (random) expectation.
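To make the relationship between the three fixation indices concrete, the sketch below computes GST, Hedrick's standardised G′ST and Jost's D for a single locus from population allele frequencies. It uses the basic textbook definitions without the sample-size corrections that GenAlEx applies, so it is an illustration of the quantities rather than a reproduction of the reported estimates; the function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fixation_indices(pop_freqs):
    """Return (G_ST, G'_ST, Jost's D) for one locus.

    pop_freqs: list of dicts mapping allele -> frequency, one per population.
    Populations are weighted equally and no sampling correction is applied.
    """
    k = len(pop_freqs)
    if k < 2:
        raise ValueError("at least two populations are required")
    alleles = set().union(*pop_freqs)
    # H_S: mean within-population expected heterozygosity.
    hs = sum(expected_heterozygosity(f) for f in pop_freqs) / k
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    pooled = {a: sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles}
    ht = expected_heterozygosity(pooled)
    gst = (ht - hs) / ht if ht > 0 else 0.0
    # Hedrick's G'_ST rescales G_ST by its maximum possible value given H_S.
    g_prime = gst * (k - 1 + hs) / ((k - 1) * (1.0 - hs))
    jost_d = (ht - hs) / (1.0 - hs) * k / (k - 1)
    return gst, g_prime, jost_d


# Two populations sharing no alleles, each with two equally common alleles.
toy = [{"A": 0.5, "B": 0.5}, {"C": 0.5, "D": 0.5}]
gst, g_prime, jost_d = fixation_indices(toy)
print(round(gst, 3), round(g_prime, 3), round(jost_d, 3))
```

In the toy case the populations share no alleles, yet GST is only about 0.33 because within-population diversity is high, while G′ST and Jost's D both reach 1. This dependence of GST on within-population diversity is one reason to report a standardised index for highly polymorphic markers such as microsatellites.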

| Mitochondrial DNA genetic structure
The mtDNA haplotype network (Figure 2) expands on previously published work (Ambrose et al., 2012), with the addition of sam-

| Microsatellite analyses
All microsatellite analyses identified strong genetic structure between populations defined a priori. We find support for all previous genetic groups found by Ambrose et al. (2012) with a high probability of assignment to a single cluster for most individuals in both DAPC and STRUCTURE analyses (Figures 1 and 2) … compared with the null model (Figure 3 and Table 2).
The Northern Territory (Australia) population is the most distantly related at the nuclear level based on consistently high pairwise …
… shows observed (black line) variance (Sigma) versus the null expected distribution, which was generated by randomisation of data in R, as outlined in Excoffier et al. (1992). The top left panel shows that the observed Sigma within samples is lower than expected under the null distribution (p < 0.01). The top right panel shows that Sigma observed between samples within populations is higher than expected (p < 0.01). The bottom left panel shows that Sigma observed between populations within regions is higher than expected (p < 0.01). The bottom right panel shows that Sigma observed between regions is higher than expected (p < 0.01).
… from the northern New Guinean population were previously found to be completely sorted from other populations of the species, forming a well-supported monophyletic clade (Ambrose et al., 2012). The northern New Guinean population was also the only An. hinesorum population that could not be detected by species-specific genomic DNA probes (Beebe et al., 1996). Altogether, this evidence suggests …
In particular, what is now northern Queensland was connected to southern New Guinea for more than 90 per cent of the last 250,000 years, while the Northern Territory was only connected to New Guinea for <10 per cent of this period (Voris, 2000). Following the most recent glacial maximum, the Northern Territory separated from New Guinea approximately 12,000 years bp and Queensland separated from southern New Guinea as recently as 7000 years bp (Lambeck & Nakada, 1990; Nix & Kalma, 1975). Close relationships in the mtDNA haplotype network reflect these recent connections, and it is likely that the populations from Queensland and southern New Guinea formed a large metapopulation encompassing this area during the Pleistocene.
Nuclear microsatellites support this hypothesis, as the Queensland and southern New Guinean populations are closely related for these markers.

The Northern Territory population is the most genetically distant of any population in microsatellite analyses, as observed through the pairwise fixation indices. This may be explained by the reduced period of time that the Northern Territory was connected to New Guinea during the Pleistocene. Although there were land bridges connecting the Northern Territory to New Guinea relatively recently, the climate during glacial maxima in large parts of Australia was much drier than it is today (Williams et al., 2009). This means that there would have been little opportunity for connectivity between the Northern Territory and other populations in Queensland and New Guinea, even when New Guinea was connected to the Northern Territory directly. Today, An. hinesorum in the Northern Territory could be a remnant population with a restricted distribution. Additionally, the monsoonal climate in the Northern Territory drives intense dry periods, likely causing this population to go through regular bottlenecks, allowing greater potential for genetic drift to occur. These climatic and biogeographic factors working on a small, isolated population may explain why the Northern Territory population appears so distinct for the microsatellite markers used in this study.

| Evolution of exclusive zoophagy in the Solomon Archipelago
The Anopheles farauti complex shows variation in human host preference. Zoophagy is a derived trait in this complex that has evolved at least twice in the Solomon Archipelago: once in An. irenicus (another exclusively zoophagic species in the An. farauti group) and at least once in An. hinesorum (Ambrose et al., 2012;Beebe et al., 2000;Foley et al., 1994). This group therefore provides a useful system to study the genetic basis of human host preference in mosquitoes. Specialization in this group has occurred in the opposite direction to that in An. gambiae s.s. and Ae. aegypti, with species in the An. farauti complex having evolved from anthropophagic generalists to exclusively zoophagic species (and populations; Ambrose et al., 2012). The An. farauti complex therefore provides a useful counterpoint for comparison to other well-studied mosquito systems.
Human landing catches performed during studies in Bougainville and Guadalcanal have failed to collect An. hinesorum despite productive larval habitats near the human landing catches (Foley et al., 1994). Thus, the zoophagic trait appears to be fixed in An. hinesorum populations from the northern and southern Solomon Archipelago. The hosts that these populations are feeding upon remain unknown, but there was probably a limited range of hosts available on the Solomon Archipelago at the time of initial colonization, and sizeable mammals may not have been present (Ambrose et al., 2012). Anopheles hinesorum in Australia and New Guinea are attracted to carbon dioxide baited traps while populations in the Solomon Archipelago are not (Cooper et al., 2009; Foley et al., 1994; Van Den Hurk et al., 1997), a phenotypic difference indicating that the colonization of these islands may have driven the adaptation of An. hinesorum to ectothermic hosts. Other animals, including insects, have experienced host shifts and specialization when colonizing islands (Jorge et al., 2018; Simberloff, 1974; Tseng et al., 2018; Yassin et al., 2016), and the Solomon Archipelago supports an abundant and diverse frog and reptile fauna (Morrison et al., 2007; Pikacha et al., 2016), providing a plentiful potential food source.
Genetic connectivity between islands in the Solomon Archipelago is also indicated by the microsatellite data. STRUCTURE plots and DAPC compoplots show mixed assignment of individuals from different islands in the Solomon Archipelago. Comparatively low fixation indices also indicate that recent gene flow has occurred. This result makes sense given that most islands of the Solomon Archipelago were connected by land bridges and formed a larger island known as Greater Bukida, separated from Guadalcanal by only 2 km of ocean at times of lowest sea level (Mayr & Diamond, 2001).

Recent fieldwork in Solomon Islands Western Province identified An. hinesorum feeding on humans (Burkot et al., 2018). This is an important finding as the species has rarely been collected feeding on humans in the Solomon Archipelago during many previous attempts to collect An. hinesorum in human landing catches, despite abundant larvae in the immediate landscape. The mtDNA (COI) of samples from this population fell into three genetically distinct and geographically defined groups: one from New Guinea (Papuan Peninsula) and two from the Solomon Archipelago (northern and southern lineages).
One Western Province haplotype sampled in eleven individuals is identical to a sequence sampled from an anthropophagic population … Figure S3). At this time, anthropophagic An. hinesorum were abundant at these New Guinean airbase sites …
Even though a commonly sampled mitochondrial haplotype in the Solomon Islands Western Province is identical to a haplotype sampled in a New Guinean population (the only case of haplotype sharing between New Guinean and Solomon Archipelago populations), the nuclear microsatellites suggest that the genomes of the Western Province population are mostly of native Solomon Archipelago origin. This is shown by the close relationships at microsatellite loci between individuals from Western Province and the rest of the Solomon Archipelago. In mosquitoes, olfaction is the primary sense that governs host preference (Takken, 1991), and it is likely that only small regions (e.g. a small number of olfactory genes) of the genome are associated with the ability to detect and feed on humans (Raji & DeGennaro, 2017) …
Today, the only common species known to transmit malaria in the Solomon Archipelago is the coastally restricted An. farauti (Beebe et al., 2015). The emergence of anthropophagy in a population of An. hinesorum from the Solomon Archipelago has serious implications for the transmission of malaria, especially if this phenotype spreads through other populations in the Archipelago.

Anopheles hinesorum is common and abundant through the Solomon Islands, showing larval site plasticity and existing both inland and at elevation (Beebe et al., 2015). If anthropophagy provides a selective advantage (i.e. blood source availability and improved fecundity), it may spread quickly, resulting in the emergence of a second common malaria vector in the Solomon Archipelago. This could have serious implications for the spread of malaria in the Solomon Archipelago due to the high abundance of An. hinesorum through the Solomon Islands.

| CONCLUSIONS
In this study, we have achieved a more complete understanding of population genetic relationships of An. hinesorum in the Western Pacific, clarifying population subdivisions. This lays the groundwork necessary to use this species as a novel model system for studying human host preference in mosquitoes. Large mtDNA divergences likely do not indicate species boundaries, as nuclear gene flow is evident between some highly diverged lineages in the Solomon Archipelago. Although we cannot be certain that exclusive zoophagy in the Solomon Island populations was transmitted between these divergent lineages via gene flow, our results suggest that gene flow between islands of the Archipelago has occurred. Further work is necessary to disentangle the hypotheses regarding the origins of zoophagy in the Solomon Archipelago. We detected New Guinean mitotypes in a recently discovered anthropophagic population from the Solomon Islands indicating that human-mediated transport of the species may have resulted in anthropophagy being introduced to the Archipelago. The emergence of this phenotype may have ramifications for the epidemiology and transmission of malaria on the Solomon Archipelago; specifically, it may result in increased malaria transmission in inland villages.

CONFLICT OF INTEREST
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available