Open Access

No recent adaptive selection on the apyrase of Mediterranean Phlebotomus: implications for using salivary peptides to vaccinate against canine leishmaniasis


Shazia S. Mahamdallie, Department of Entomology, Natural History Museum, Cromwell Road, London SW7 5BD, UK.
Tel.: +44(0)2079425622;
fax: +44(0)2079425229;
e-mail: ssmaham@gmail.com


Vaccine development is informed by a knowledge of genetic variation among antigen alleles, especially the distribution of positive and balancing selection in populations and species. A combined approach using population genetic and phylogenetic methods to detect selective signatures can therefore be informative for identifying vaccine candidates. Parasitic Leishmania species cause the disease leishmaniasis in humans and mammalian reservoir hosts after inoculation by female phlebotomine sandflies. Like other arthropod vectors of disease agents, sandflies use salivary peptides to counteract host haemostatic and immunomodulatory responses during bloodfeeding, and these peptides are vaccine candidates because they can protect against Leishmania infection. We detected no contemporary adaptive selection on one salivary peptide, apyrase, in 20 populations of Phlebotomus ariasi, a European vector of Leishmania infantum. Maximum likelihood branch models on a gene phylogeny showed apyrase to be a single copy in P. ariasi but an ancient duplication event associated with temporary positive selection was observed in its sister group, which contains most Mediterranean vectors of L. infantum. The absence of contemporary adaptive selection on the apyrase of P. ariasi may result from this sandfly’s opportunistic feeding behaviour. Our study illustrates how the molecular population genetics of arthropods can help investigate the potential of salivary peptides for disease control and for understanding geographical variation in vector competence.


Vector-borne diseases, often transmitted by insects, are responsible for 17% of the global burden of parasitic and infectious diseases (WHO 2008). With the (re-)emergence of some vector-borne diseases, such as leishmaniasis, vaccine development constitutes a primary tool in the fight against transmission (WHO 2011). Vaccine candidates include insect salivary peptides (Collin et al. 2009) as well as parasite antigens that elicit a mammalian–host immune response (Evans and Kedzierski 2011). Prioritization and potential efficacy can be informed by population genetic studies of targeted antigens in natural parasite populations. For example, population genetic-based selection statistics were informative for the identification of Plasmodium molecules as malaria vaccine targets (Mu et al. 2007). The majority of candidate antigens biologically function through host-acquired immunity and are associated with, and therefore identified by, signatures of balancing selection, for example adaptive evolutionary maintenance of genetic variation by favouring intraspecific low- to medium-frequency haplotypes of Plasmodium (Conway and Polley 2002; Ochola et al. 2010; Weedall and Conway 2010). Furthermore, a population genetic approach has been used to circumvent the difficulties met when designing geographically broad-range vaccines based on polymorphic parasite antigens (Duan et al. 2008; Barry et al. 2009). Practically, however, the number of variants that can constitute a vaccine is limited, often being significantly fewer than the pathogen haplotypes reported from a single geographical location, and this can compromise vaccine efficacy (Takala et al. 2009). Therefore, vaccine candidates showing low levels of intraspecific polymorphism might be prioritized, for example those under purifying selection resulting from functional constraint (Doi et al. 2011), as long as population haplotype and associated amino acid diversity and distribution are well characterized (Barry et al. 2009).

The parasite Leishmania infantum Nicolle, 1908 (Euglenozoa, Trypanosomatidae) is the causative agent of zoonotic visceral leishmaniasis (ZVL), a neglected tropical disease that is sometimes fatal to humans in the Mediterranean region, Asia and Latin America (WHO 2004; Ready 2010). Leishmania infantum survives in a mammalian host–sandfly vector cycle, with the domestic dog considered to be the main reservoir of infection (Quinnell and Courtenay 2009), and therefore control by vaccination has both medical and veterinary relevance.

Evolutionary adaptations that counteract or manipulate host immune responses have been proposed for sandflies (Belkaid et al. 2000) and Leishmania (Cunningham 2002), as have undefined coevolutionary arms races between sandfly and parasite (Beverley and Dobson 2004). Among the molecules implicated are sandfly salivary peptides. These are pharmacological agents pumped, along with Leishmania parasites, into feeding pools in mammalian skin by bloodfeeding adult female sandflies (Ribeiro and Francischetti 2003). Some of these salivary peptides counteract host haemostatic and immunomodulatory responses to capillary laceration or provide protective immunity to Leishmania in experimental systems, which has stimulated research into their use as second-generation vaccine candidates, with or without parasite antigens (Evans and Kedzierski 2011).

In experimental models, some of the diverse salivary peptides help to control Leishmania infections [Th1-type cell-mediated immunity (CMI)] or exacerbate them (Th2-type CMI), partly depending on the mammalian species and the history of exposure to saliva (Collin et al. 2009). Consequently, specific salivary peptides have been used in two ways for the experimental vaccination of mice, providing protection against Old World Leishmania major by stimulating either a cellular response [i.e. a delayed-type hypersensitivity (DTH)] that modifies the bite site in a way harmful to the parasite and/or its establishment, as shown for PpSP15 from the natural vector Phlebotomus papatasi (Valenzuela et al. 2001a; Oliveira et al. 2008), or a humoral response that neutralizes the exacerbation of the infection, as demonstrated for maxadilan from the neotropical sandfly Lutzomyia longipalpis (Morris et al. 2001).

Similar to other immunity genes, sandfly salivary peptides do appear to be evolving at different rates, because some have been reported to be specific to genera, subgenera and even species (Anderson et al. 2006), and show population-level antigenic polymorphism (Milleron et al. 2004). Insights provided by metapopulation analyses of the genetic diversity of vaccine candidates (Barry et al. 2009) inform us that the choice of salivary peptide for effective vaccination will partly depend on its natural polymorphism among geographical populations of ectoparasitic sandfly species, which will be governed by its rate of change in response to selection pressures from regional populations of both hosts and parasites. Any adaptive selection of sandfly salivary peptides arising from interactions with hosts or parasites will have practical implications for a vaccination programme because of the contrasting effects of selection type. Protective immunity against Leishmania has been reported to differ for the saliva of wild and colonized P. papatasi, but the conclusion that this results from colony sandflies experiencing inbreeding and artificial selection pressures (Ahmed et al. 2010) should be verified by experimental replication.

In the current report, we characterize the adaptive evolution of one of the sandfly salivary peptides that are vaccine candidates (Valenzuela et al. 2001a; Oliveira et al. 2006) in the Mediterranean region, where the transmission of L. infantum by five closely related phlebotomine sandflies (Diptera, Psychodidae) of the subgenus Larroussius (Ready 2008, 2010) suggests the possible use of a single vaccine for disease control. We have used as a model the natural variation of apyrase (E.C. in Phlebotomus (Larroussius) ariasi Tonnoir, 1921, one of the two vectors of L. infantum in southwest Europe (Ready 2010), and other related Phlebotomus. Sandfly apyrase is a potent antiplatelet haemostatic factor (Valenzuela et al. 2001b; Hamasaki et al. 2009), which we chose to target for several reasons: it is a vaccine candidate, because a recombinant expressing the apyrase of P. ariasi produced an appropriate protective-like cellular DTH response in a mouse model (Oliveira et al. 2006); functional aspects of the structure of this Cimex family apyrase (Valenzuela et al. 2001b), including putative MHC epitopes (Kato et al. 2006), can be surmised because of characterization by X-ray crystallography (Dai et al. 2004) and site-directed mutagenesis (Yang and Kirley 2004) of the human homologue; and as a practical advantage, we expected to be able to design conserved primers for the routine PCR amplification and direct sequencing of the apyrase genes of sandflies, because they were unlikely to occur in large and highly variable gene families (Anderson et al. 2006). The moderate levels of polymorphism reported might reflect the failure of sandfly apyrase to induce a humoral immune response, although this has been demonstrated only in one P. ariasi–mouse model (Oliveira et al. 2006).

Here, we report a thorough investigation of adaptive selection on the apyrases of wild sandflies, using a combination of phylogenetic and population genetic approaches. For the latter, 20 spatio-temporal populations of P. ariasi were characterized, mainly from areas endemic for ZVL, because selection can vary across subpopulations and can be strong enough for detection by some tests in only 15–50% of populations (Garrigan and Hedrick 2003; Spurgin and Richardson 2010). Practically, the detection of positive or balancing selection requires the use of a variety of tests but, unfortunately, those sensitive to contemporary or recent selection can be confounded by neutral demographic variation (Ramírez-Soriano et al. 2008). Therefore, we targeted the vector P. ariasi, for which demographic variation in apyrase could be recognized by reference to neutral markers with well-supported molecular phylogenies (Mahamdallie et al. 2011). The types and uses of population genetic tests to detect signatures of selection have been reviewed elsewhere (Nielsen 2005; Zhai et al. 2009). Our conclusions take into account the power of individual tests, which can depend on the mode of selection, duration of the selective signal, mutation position, recombination, divergence time and other factors.

Materials and methods

Sampling and molecular biology techniques

Mahamdallie et al. (2011) reported our methods for sandfly sampling and identification, genomic DNA extraction, DNA amplification by PCR and cycle sequencing. Figure 1 and Table S1 give location information for the 20 spatio-temporal populations of P. ariasi and other sandflies molecularly characterized. Table 1 gives details of new primers and PCRs developed to amplify apyrase fragments for direct cycle sequencing. Conserved forward primer APY-1F and reverse primer APY-3R were designed to amplify apyrase amino acids 26–213 out of the total approximately 336 (Anderson et al. 2006) from P. ariasi and other Phlebotomus species, with larger fragments not being susceptible to routine amplification and direct sequencing. However, the conserved fragment had to be cloned for four species with duplicate loci: 10–15 clones were sequenced from each species’ library, built using DNA from two specimens and the TOPO TA cloning© kit (Invitrogen™, Renfrew, UK). Excluding primers, the fragment length was 514 base pairs (bp) for species with duplicate loci or 520 bp for single-copy genes. For P. ariasi, eight allele-specific primers were designed to discriminate genotypes using the PCR amplification of specific alleles technique (Mahamdallie et al. 2011). MACVECTOR v11.0 (MacVector Inc., Cary, NC, USA) was used to identify any changes in the secondary structure of the deduced proteins.

Figure 1.

 Locations of collection sites for Phlebotomus ariasi in north-east Spain and southern France, with ellipses marking the regional populations supported by the Analysis of Molecular Variance (AMOVA) of the genotypes of apyrase and other neutral loci (Mahamdallie et al. 2011). Population codes are given in Table 2.

Table 1.   Novel primers and PCR conditions for the amplification and direct sequencing of the apyrase gene fragment of Phlebotomus ariasi and other Phlebotomus.
Primer | 5′ nt | Primer sequence 5′-3′ | Primer pair | Tm (°C) | MgCl2 (mM)
  1. 5′ nt: 5′ nucleotide’s position in GenBank accession AY845193. Tm: one-/two-step annealing temperatures.

  2. *Conserved; others allele specific.


Phylogenetic analysis

Phylogenetic relationships among alleles were reconstructed by Bayesian estimation (MrBayes v3.1.2) (Ronquist and Huelsenbeck 2003), with nucleotide substitution models for each codon position being selected using the Akaike Information Criterion approach in MrModeltest v2.3 (Nylander 2004), and following the steps of Mahamdallie et al. (2011). A network of P. ariasi alleles was constructed in TCS v1.21 (Clement et al. 2000) using a 95% parsimony connection limit, to investigate phylogeographic variation.

Testing for selection among and within lineages

The CODEML programme of Phylogenetic Analysis by Maximum Likelihood (PAML v4.2) (Yang 2007) tests whether a branch has an accelerated rate of amino acid replacements against the background rate of the other branches in a phylogeny. The branch lengths of input gene trees were re-estimated under ML (model = 0, NSsites = 0 for nucleotide substitutions per codon) and then used as initial values in further CODEML analyses. The transition/transversion rate ratio (κ) was estimated and alpha was fixed at a constant rate in the control file. Bayes Empirical Bayes (BEB) (in CODEML) was used to identify amino acids under positive selection in Random-site models.

Selection was also investigated within P. ariasi. Codon usage bias is indicative of selection for preferred protein expression or functional constraint, and so three estimates were obtained using DnaSP v4.90.1 (Rozas et al. 2003). Additionally, selection was investigated using population-based neutrality tests. The McDonald-Kreitman (MK) test was implemented in DnaSP v4.90.1, along with the associated Neutrality Index (NI) to indicate the direction and degree of selection. Their sensitivity relies on the appropriate choice of an out-group (Wayne and Simonsen 1998), which was selected using dS saturation levels (<0.5) estimated by the accurate ML method based on models of codon substitutions (CODEML).
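The arithmetic behind the MK test and the Neutrality Index can be illustrated with a minimal sketch. It assumes the usual definition NI = (Pn/Ps)/(Dn/Ds) and a two-sided Fisher's exact test on the 2 × 2 table of fixed and polymorphic substitutions. The function names and the counts are hypothetical, and the code is not the DnaSP implementation.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); returns None when a ratio is undefined."""
    if ps == 0 or dn == 0 or ds == 0:
        return None
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    p_sum = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p_sum)


# Hypothetical counts (not taken from Table 3):
# fixed differences Dn, Ds and polymorphisms Pn, Ps.
dn, ds, pn, ps = 4, 8, 2, 4
ni = neutrality_index(dn, ds, pn, ps)
p_value = fisher_exact_two_sided(dn, ds, pn, ps)
print(ni, p_value)
```

Under this definition, NI < 1 points to an excess of fixed nonsynonymous differences (positive selection) and NI > 1 to an excess of nonsynonymous polymorphism (negative selection).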

Three other neutrality tests (Arlequin v3.11) (Excoffier et al. 2005) were used to search for skews in allele frequencies, to detect weaker and more recent signals of intraspecific selection. Negative and positive values of Fu & Li’s and Tajima’s D statistics result from directional and balancing selection, respectively. For the Ewens–Watterson (EW) test, negative and positive values arise from a deficiency or an excess of homozygotes and signal balancing or directional selection, respectively. For neutrality tests, the significance of any deviations from neutral expectations was calculated using 10 000–16 000 coalescence simulations, with multiple tests being manually corrected for family-wise type 1 errors by applying a sequential Bonferroni correction (α = 0.05) (Holm 1979).
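The sequential Bonferroni correction (Holm 1979) can be sketched as follows. This is an illustrative implementation with hypothetical P values, not the manual procedure applied to our tables: the ordered P values are compared with α/m, α/(m − 1), and so on, and testing stops at the first non-significant comparison.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's procedure rejects H0."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank + 1)-th smallest P with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger P value is retained as well
    return reject


# Hypothetical P values from four tests on one population.
example_p = [0.01, 0.04, 0.03, 0.005]
print(holm_reject(example_p))  # [True, False, False, True]
```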

These statistics can be affected not only by selection but also by neutral demographic processes and recombination. Population expansion (or a selective sweep) is signalled by finding significant deviations from neutral expectations for Fu’s FS statistic (Arlequin v3.11) or the R2 statistic (DnaSP v4.90.1). Only the latter retains power when recombination is high and there are many segregating sites (S) (Ramírez-Soriano et al. 2008). The minimum number of recombination events (Rm) was calculated in DnaSP v4.90.1 using the four-gamete test.
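The logic of the four-gamete test can be sketched in a few lines. A pair of biallelic sites showing all four two-site haplotypes implies at least one recombination event between them (under an infinite-sites assumption), and a lower bound on Rm is the largest number of such intervals that do not overlap. The function names and the haplotypes are hypothetical, and the code is a simplified illustration rather than the DnaSP routine.

```python
from itertools import combinations


def incompatible_site_pairs(haplotypes):
    """Return (i, j) site pairs at which all four two-site gametes occur."""
    n_sites = len(haplotypes[0])
    # Keep only biallelic (segregating) sites.
    seg = [s for s in range(n_sites)
           if len({h[s] for h in haplotypes}) == 2]
    pairs = []
    for i, j in combinations(seg, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            pairs.append((i, j))
    return pairs


def min_recombination_events(haplotypes):
    """Lower bound on recombination events from non-overlapping intervals."""
    intervals = sorted(incompatible_site_pairs(haplotypes), key=lambda p: p[1])
    rm, last_end = 0, -1
    for start, end in intervals:
        # Greedily count intervals that start at or after the previous end.
        if start >= last_end:
            rm += 1
            last_end = end
    return rm


# Hypothetical aligned haplotypes (one string per chromosome).
sample = ["AAAA", "ATAA", "TAAA", "TTAA", "AAAT", "AATA", "AATT"]
print(incompatible_site_pairs(sample), min_recombination_events(sample))
```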

Additionally, deviations from exact Hardy–Weinberg equilibrium (HWE) at a single locus were assessed (Arlequin v3.11) to reveal whether selection occurred in the current generation (e.g. an excess of observed homozygotes or heterozygotes resulting from directional or balancing selection, respectively).

Population genetics

FST estimates of genetic distance between population pairs were based on allele frequencies and calculated in FSTAT v2.9.3.2 (Goudet 2002) using the exact test of Weir and Cockerham, which is unaffected by the sampling scheme. One thousand permutations with a sequential Bonferroni correction were used to derive significance levels for multiple comparisons. Association was sought between genetic distance [FST/(1−FST)] and geographical distances between population pairs (Mahamdallie et al. 2011). Regression analysis and the nonparametric significance of any isolation-by-distance were implemented within the ISOLDE suboption of GENEPOP v4.0 (Raymond and Rousset 1995), using 1000 permutations for the Mantel test and no assumptions about the dimension of dispersal.
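The isolation-by-distance analysis can be illustrated with a minimal sketch: pairwise FST is linearised as FST/(1 − FST) and a Mantel permutation test assesses its association with geographical distance. The matrices are hypothetical, and the use of Pearson's correlation with a one-sided P value is an assumption made for simplicity; it is not necessarily the statistic used by the ISOLDE option of GENEPOP.

```python
import random
from math import sqrt


def linearised_fst(fst):
    """Transform pairwise FST to FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabelling rows."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(genetic, geographic, permutations=1000, seed=1):
    """One-sided Mantel test: returns (r, P) for a positive association."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel populations in the genetic matrix only
        if pearson(upper_triangle(genetic, perm), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical pairwise FST and distances (km) for four populations.
fst = [[0, 0.05, 0.10, 0.20], [0.05, 0, 0.06, 0.15],
       [0.10, 0.06, 0, 0.08], [0.20, 0.15, 0.08, 0]]
km = [[0, 100, 250, 500], [100, 0, 150, 400],
      [250, 150, 0, 250], [500, 400, 250, 0]]
gen = [[linearised_fst(v) for v in row] for row in fst]
r, p = mantel(gen, km)
print(round(r, 3), p)
```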

Analysis of MOlecular VAriance (Arlequin v3.11) was used to evaluate the amount of haplotype diversity correlated with different nested levels of hierarchical population subdivision. Interhaplotype distances were calculated using both haplotype frequencies and pairwise molecular distances.


Apyrase gene structure and phylogenetic lineages in Phlebotomus ariasi and related sandflies

Our multispecies 154-amino acid alignment without introns or indels (Figure S1) contained most of the known functional sites of calcium-activated apyrases (CANs), including residues binding nucleotides (eight of 13 sites) or calcium (three of six) (Dai et al. 2004; Yang and Kirley 2004), those that increased ADPase activity following in vitro mutagenesis (eight of 10) (Yang and Kirley 2004) and blocks predicted to be MHC epitopes (four of six) (Kato et al. 2006). For all four types of site, Phlebotomus residues were either constant or varied phylogenetically among species and duplicate loci. None was polymorphic in P. ariasi. The secondary structure changes predicted by the Chou–Fasman and/or Robson–Garnier models usually involved only the loss of a single beta-sheet, which was also true for the conservative residue replacements at the two polymorphic sites (codons 32, 152) in P. ariasi.

A robust phylogeny of apyrase genes was required for selection tests: as input in maximum likelihood (PAML) branch models, to ensure that orthologous sequences from monophyletic species were being compared in all other tests and to identify an out-group for the MK and Fu & Li’s D tests. An initial Bayesian phylogeny included GenBank sequences from the subgenera Phlebotomus and Euphlebotomus, but the absence of congruence with other gene trees (Esseghir et al. 2000; Mahamdallie et al. 2011) indicated the inappropriateness of these distant out-groups. Strong support was expected and found (Fig. 2) for treating the apyrases of species of the subgenera Transphlebotomus and Adlerius as out-groups for those of the subgenus Larroussius. The apyrase sequences of P. ariasi were monophyletic in the Bayesian phylogeny and the genealogical network (Figure S2), and neither analysis supported the presence of phylogenetic species or gene duplication events in this taxon.

Figure 2.

 Bayesian reconstruction of the phylogeny of the 462-nucleotide alignment of apyrase, with the solid ellipse marking a gene duplication event and including all alleles of each Phlebotomus species except P. ariasi (set pruned of APY alleles >1 step from the modal alleles in Figure S2). Posterior probabilities >0.7 indicate statistically supported nodes. Uppercase letters refer to branches tested in CODEML models. Subgenera Transphlebotomus and Adlerius are sister to subgenus Larroussius, which contains vectors of L. infantum (masc: P. mascittii; hale: P. halepensis; arab: P. arabiensis; majo: P. major; negl: P. neglectus; aria and APY: P. ariasi; kand: P. kandelakii; perf: P. perfiliewi; pern: P. perniciosus; tobb: P. tobbi; arab631-3, aria193, pern491, pern490: last three numerals of GenBank accession codes of published sequences).

In contrast, the sister-branch to P. ariasi showed a well-supported gene duplication event, involving the ancestor shared by Phlebotomus kandelakii, Phlebotomus perfiliewi and the Phlebotomus perniciosus complex. The duplicate lineages paralogous and orthologous to the single-copy gene sequences of P. ariasi and five other basal species (Fig. 2) were identified as branches A/B and C, respectively, in part because amino acid similarities/identities with all six basal species were lower for branch A/B (80.5–83.8/61.7–66.2%) than for branch C (88.3–91.6/74.7–83.8%) (Table S2).

All new sandfly sequences were identified as apyrases by BLASTx searches of GenBank nucleotide sequences, with the top 20 matches (E values < 9e−07) being with CANs. This indicates the absence of major functional changes following gene duplication.

Rates of evolution and types of selection among apyrase phylogenetic lineages

Heterogeneity of long-term selection pressure and its direction were tested in PAML branch models to estimate the phylogenetic distribution of the relative ratio (ω) of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) to synonymous substitutions per synonymous site (dS). Evidence for selection in the monophyletic lineage of P. ariasi was tested using per-codon branch lengths and a no-clock model (unrooted phylogeny), which was favoured over a global clock [Likelihood ratio test (LRT) = −308.36, df = 28, P < 0.001]. Three tests compared a null model of a single ω over all branches (Fig. 2) with a two-ratio model, for which ω on the branch of P. ariasi or A or A/B was free to vary against the background of all other branches. Positive selection was detected solely on paralogous branch A (ω > 1; LRT: P < 0.001), which had only nonsynonymous substitutions. Otherwise, there was no significant heterogeneity in selection pressure (LRT: P > 0.05), and with ω < 1, we conclude that the apyrases of these Phlebotomus species, including P. ariasi, are currently under predominantly purifying selection, not positive selection. By comparing two-ratio against three-ratio models, significant heterogeneity in the level of purifying selection was found between the branches A/B, C and D, with the paralogous nature of branch A/B being indicated by it having a greater proportion of nonsynonymous substitutions: (ωA/B = 0.375) > (ωC = 0.255) > (ωD = 0.143).
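The likelihood ratio tests used to compare nested branch models follow a standard recipe, sketched here under stated assumptions: twice the difference in log-likelihood is referred to a chi-square distribution whose degrees of freedom equal the number of extra free parameters (one for a two-ratio versus a one-ratio model). The log-likelihood values are hypothetical, and the chi-square tail is computed from closed-form series for integer degrees of freedom so that no external library is needed.

```python
from math import erfc, exp, pi, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return exp(-x / 2) * total
    # Odd df: normal tail plus a finite correction series.
    total, term = 0.0, 1.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P) for two nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods: one-ratio (null) versus two-ratio model.
stat, p_value = likelihood_ratio_test(-2510.4, -2503.1, df=1)
print(round(stat, 2), p_value)
```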

Random-site and Fixed-site models were also tested, because positive selection and heterogeneity in selection can be masked by averaging codons over branches. Positive selection among sites was detected by Random-site models, with codons 131, 132 and 145 identified but not statistically supported (BEB Pr (ω > 1) < 0.95). None of these codons varied in P. ariasi. Fixed-site models partitioned the data based on a priori knowledge of apyrase function (Figure S1). No positive selection was demonstrated for any class of exposed sites (nucleotide or calcium binding, increased ADPase activity and putative epitopes) compared with either the other (buried) sites or the null model of homogeneity among sites. The binding sites had nearly twice as many nonsynonymous substitutions as the buried sites, with the difference being associated with the paralogous lineage (branch A/B).

Our analysis is unlikely to suffer from type 2 errors, because our data set compared 154 codons, more than the 50–100 necessary for adequate statistical power (Anisimova et al. 2001), and at least 10 taxa with a sequence divergence (dS) of 2.45–3.06, more than the minimum requirement of 4–5 species and a summed dS over all branches on the tree of >0.5 (PAML v4.2).

Types of selection on apyrase in Phlebotomus ariasi populations

In the 459 specimens of P. ariasi characterized from 20 spatio-temporal populations, 86 nucleotide genotypes were found, composed of 47 alleles translating into 15 amino acid alleles (Table 2). The diversity of amino acid alleles might lead one to suspect balancing selection in the population from Portugal (four allele frequencies >0.05) and strong directional selection in the populations from northern Spain and southern France (one allele, 02, with frequency >0.87). However, no selection for codon usage was shown by three measures: low values of the Codon Bias Index (0.32–0.35; where 0 = unbiased, 1 = extreme bias), high values of the Effective Number of Codons (57.7–59.0, where 61 = unbiased, 20 = extreme bias) and 51.4–54.0% GC content at synonymous third codon positions. Furthermore, no population showed a significant departure from neutral expectation using the associated MK and NI tests (Table 3), indicating no long-term strong positive directional or balancing selection. The same result was obtained for a MK test of all individuals from France and Spain (N = 419, P = 1.0, NI = 0.919) and the total data set (N = 459, P = 1.0, NI = 0.951). Type 2 errors are unlikely for the MK test applied to populations where N = 50–100 (Zhai et al. 2009; Ochola et al. 2010).

Table 2.   Geographical variation in the frequencies of the amino acid alleles of the apyrase of Phlebotomus ariasi, showing the near fixation of allele 02 in France and northern Spain, and different predominant alleles elsewhere.
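Region | Morocco | Portugal | NW Spain | NE Spain | C Pyrenees, France | Eastern Pyrenees, France | Massif Central (MC), Rhone and Lot valleys, France
Subregion (last group) | MC (south) | MC | Rhone | Lot, France

AA allele | Frequency by population (20 columns, in the region order above; - = not recorded)
02 | 0.294 | 0.283 | 0.870 | 0.891 | 0.907 | 0.972 | 0.935 | 0.979 | 0.955 | 1 | 0.977 | 0.978 | 0.979 | 1 | 0.979 | 1 | 1 | 0.932 | 1 | 1
07 | - | 0.152 | 0.022 | 0.043 | 0.019 | 0.028 | 0.033 | - | 0.023 | - | - | - | 0.021 | - | - | - | - | - | - | -
04 | - | - | 0.065 | 0.043 | 0.037 | - | 0.033 | - | 0.023 | - | 0.023 | - | - | - | 0.021 | - | - | - | - | -
06 | - | - | - | 0.022 | 0.037 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
01 | 0.676 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
08 | - | 0.435 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
09 | - | 0.087 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
Other private (n)* | 0.029 (1) | 0.044 (2) | 0.043 (1) | - | - | - | - | 0.020 (1) | - | - | - | 0.021 (1) | - | - | - | - | - | 0.068 (2) | - | -

  1. N, number of individuals.

  2. Population locations are described in Fig. 1 and Table S1. The allele predominating in each population is shaded.

  3. *Total for alleles with frequency <0.05.
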
Table 3.   Tests showing the absence of positive or balancing selection on the nucleotide alleles of apyrase from all populations of Phlebotomus ariasi.
Population | N | S | h | Ds | Ps | Dn | Pn | MK Fisher P | NI† | Fu and Li D | Fu and Li D P | Tajima D | Tajima D P
  1. N, sample size; S, no. of segregating sites; h, no. of alleles; Ds, Ps, Dn, Pn: no. of synonymous (s) and non-synonymous (n) substitutions for fixed (D) or polymorphic (P) sites; MK, McDonald-Kreitman test; NI, neutrality index (<1: positive selection; >1: negative selection; NA: not applicable); EW, Ewens–Watterson test; Rm, minimum no. of recombination events.

  2. Population locations are described in Fig. 1 and Table S1. P < 0.05: significant.

  3. The one significant test after sequential Bonferroni correction (*) indicates a recent population expansion or selective sweep.

  4. †Tests with an out-group (Phlebotomus major).


Three tests for selection in current or recent generations (Fu & Li’s D, EW and Tajima’s D) showed no significant deviation from neutral expectation in any population (Table 3), after Bonferroni corrections for family-wise type 1 errors, thereby providing support neither for positive directional or balancing selection nor for population expansions or selective sweeps. The latter alternatives were not detected in any European population by Fu’s FS test (no significant negative values), which has relatively more power for this purpose unless S and recombination are high, when the R2 test is more appropriate (Ramírez-Soriano et al. 2008). However, this statistic was never significant, and so recent population expansions or selective sweeps can be rejected for all European populations, even though recombination was often detected (Table 3) for residues 100–139, 139–352, 433–451 and 451–475. EW and Tajima’s D tests were also not significant after pooling all French and Spanish samples (N = 419) (data not shown), and type 2 errors are unlikely for such a large population (Zhai et al. 2009; Ochola et al. 2010). Each population was in HWE at the apyrase locus, providing no support for current selection.
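For readers unfamiliar with the frequency-spectrum statistics reported above, Tajima's D can be computed from aligned sequences as in the following minimal sketch. It assumes gap-free sequences of equal length, uses the standard normalizing constants of the statistic, and takes hypothetical input; it is not the Arlequin implementation, and it does not assess significance by coalescent simulation.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned sequences; None if it cannot be computed."""
    n = len(sequences)
    length = len(sequences[0])
    # S: number of segregating sites.
    s = sum(1 for k in range(length)
            if len({seq[k] for seq in sequences}) > 1)
    if n < 4 or s == 0:
        return None
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi_hat = sum(sum(x != y for x, y in zip(a, b))
                 for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi_hat - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical samples: an excess of singletons gives D < 0,
# whereas intermediate-frequency variants give D > 0.
print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
print(tajimas_d(["AA", "AA", "TT", "TT"]))
```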

Neutral evolution of apyrase in Phlebotomus ariasi reflects demographic variation

The demographic processes of regional isolation and isolation-by-distance can explain the distinctive patterns of allele diversity and frequency characterizing the populations sampled. Amino acid allele frequencies (Table 2) differentiated the Morocco and Portugal populations from each other and from Spain-plus-France. Additionally, hierarchical AMOVA (Table S3) supported regional groupings in France and north-east Spain (Fig. 1), matching those given by the neutral loci elongation factor-1α and mitochondrial cytochrome b (Mahamdallie et al. 2011).

Genetic distances between pairs of populations were measured by FST, which showed significantly ‘very great’ differentiation (Wright 1978) between two northern leading-edge populations in Lot, France, and populations in Morocco, Portugal, northern Spain, and some French locations (Fig. 1). With or without the leading-edge populations, there was a significant positive correlation between genetic and geographical distances (Figure S3), supporting a model of isolation-by-distance (without Lot: P < 0.001, R² = 0.565). However, the presence of two panmictic populations separated by a barrier, the Pyrenees, is as likely as a continuous population under isolation-by-distance.


A salivary peptide could show adaptive evolution for several phenotypes, including those related to the facilitation of bloodfeeding and the control or exacerbation of an infectious disease (Belkaid et al. 1998, 2000). However, no investigation before now has aimed to test rigorously using population genetic and phylogenetic tests the occurrence of natural positive directional or balancing selection on a sandfly salivary peptide. Most sequences of sandfly salivary peptides in GenBank derive from just one (Elnaiem et al. 2005; Oliveira et al. 2006) or two (Kato et al. 2006) wild populations of a species, or more usually from a cDNA library constructed using inbred individuals from colonies (Anderson et al. 2006; Hostomská et al. 2009), and so post hoc bioinformatics have permitted only limited conclusions concerning the phylogenetic and population-level conservation of amino acid sequences, functional sites and putative epitopes. Adaptive evolution was rejected for the leishmaniasis-controlling salivary peptide PpSP15 of P. papatasi (Elnaiem et al. 2005), but it was analysed in a single wild population and only by comparing the ratio of nonsynonymous to synonymous substitutions overall and in different regions of the gene, which is inappropriate for detecting recent or contemporary selection. The first evidence of antigenic polymorphism in a sandfly salivary peptide was demonstrated by Milleron et al. (2004), who hypothesized that balancing selection might be maintaining many maxadilan alleles with equivalent vasodilatory potencies in L. longipalpis.

In contrast, we applied to apyrase tests appropriate for detecting both ancient selection in the Mediterranean vectors of L. infantum, based on a well-supported phylogeny (Fig. 2) of all but one (Phlebotomus longicuspis) of the regional vectors in the subgenus Larroussius (Ready 2010), and contemporary selection in a demographically informative set of populations of one of these vectors, P. ariasi. No population test provided any support for positive directional selection, balancing selection or selective sweeps in populations of P. ariasi associated with both geographical and environmental variation (altitudes 100–1114 m.a.s.l.) in Europe (Fig. 1 and Table S1). Population genetic tests can be prone to type II errors (failure to reject a false null hypothesis) when too few populations are sampled or population sizes are small (Garrigan and Hedrick 2003; Spurgin and Richardson 2010; Zhai et al. 2009). However, the number of samples was adequate and the per-population tests were corroborated by testing large pooled populations. Physical linkage between adjacent gene regions can affect local neutral polymorphism, and population frequency tests can detect these perturbations in a gene genealogy (Simonsen et al. 1995). The absence of selective sweeps (i.e. no linkage disequilibrium) reassures us that adaptive selection in the recent past was not missed by failing to sequence the 5′ and 3′ regions of apyrase adjacent to our fragment. The geographical pattern of genetic variation was consistent with neutral demographic processes, mainly regional isolation and isolation-by-distance, resulting from range contractions and expansions during the glacial and interglacial periods, respectively, of the late Pleistocene (Mahamdallie et al. 2011). Finding no current adaptive selection on the apyrase of P. ariasi in southwest Europe might be explained by weak or diluted selection pressures. One explanation to be explored is that strong selection from hosts or parasites is absent because few sandflies in each subpopulation feed on infected dogs, the primary reservoir hosts, owing to opportunistic bloodfeeding on a range of other mammals and birds (Guy et al. 1984; De Colmenares et al. 1995) and the variable infection rates of dogs (Ready 2010).
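
To illustrate the kind of frequency-based neutrality test discussed in this paragraph, the following minimal Python sketch computes Tajima's D from aligned sequences. Tajima's D is used here only as a familiar example of this class of test; this section does not state which statistics were computed, and the function name and toy alignments are hypothetical rather than data from this study.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned sequences, or None if there is no polymorphism.

    Illustrative only: gaps and ambiguous bases are treated as ordinary characters.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences for a meaningful variance")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites in the alignment.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return None  # D is undefined without polymorphism

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989), which depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D contrasts pi with Watterson's estimator S / a1, scaled by its standard deviation.
    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


# Hypothetical toy alignments (not data from this study).
rare_variants = ["AAAA", "AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
intermediate_variants = ["AAAA"] * 3 + ["TTTT"] * 3
print(tajimas_d(rare_variants), tajimas_d(intermediate_variants))
```

An excess of rare variants gives a negative D, as expected after a selective sweep or a population expansion, whereas an excess of intermediate-frequency variants gives a positive D, as expected under balancing selection or population subdivision. Because demography and selection can produce the same sign, such statistics are usually interpreted alongside the demographic history of the populations.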

Salivary peptides often occur as gene families, within and between which there is evidence for overlapping functions (Anderson et al. 2006), and therefore, serious consideration should also be given to the possibility of multilocus gene-for-gene interactions with hosts leading to arms races of quantitative traits (Sasaki 2000) in cyclical sweeps of positive selection. These can be complex and provide multiple opportunities for any one gene to escape an arms race for a period of time. Coevolutionary selection can more easily fix adaptive alleles if a trait has a simple genetic basis, like the haemostatic action of apyrase, and this can lead to the evolution of extreme phenotypes in predators that allow them to escape from an arms race (Sasaki 2000). If this applies to ectoparasitic flies, then it could also help to explain any absence of contemporary adaptive selection on the apyrase of P. ariasi. Three other families of sandfly salivary peptides have been identified as vaccine candidates (Oliveira et al. 2006, 2008; Collin et al. 2009), but there is no report on whether they are under any selection pressure.

Based on Bayesian reconstructions, the apyrase gene tree was incongruent with that for Phlebotomus nuclear elongation factor-1α (EF-1α) (Esseghir et al. 2000; Mahamdallie et al. 2011) when the subgenera Phlebotomus and Euphlebotomus were included, which indicates the presence of paralogous apyrases in these distantly related taxa. Exclusion of these out-groups produced an apyrase gene tree with a species branching order congruent with that for the conserved EF-1α gene, except for the failure to group the members of the P. perniciosus complex, P. perniciosus and Phlebotomus tobbi. Based on this robust phylogeny, we detected strong selection on the apyrase of Phlebotomus only once, when positive directional selection was associated with an ancient gene duplication event (Fig. 2). Our well-supported phylogeny demonstrated that gene duplication occurred in an ancestor of the P. perniciosus species complex, rather than a duplicate being lost in P. ariasi as suggested previously (Anderson et al. 2006). Accelerated evolution of salivary peptides over geological time periods was reported for anopheline mosquitoes (Calvo et al. 2009) and can be inferred for sandflies, because some gene families are specific to them (Anderson et al. 2006). Gene duplication events are frequently but not inevitably followed by positive selection (Hahn 2009), which we observed for Phlebotomus apyrase. The subsequent long period of weak purifying selection also appears to be characteristic of insect immune peptides: ancient positive selection was detected for some immune genes of Drosophila (Jiggins and Kim 2007), but only purifying selection was reported for Anopheles (Lehmann et al. 2009). Adaptive evolution through balancing selection has not been detected for any dipteran immune peptide, and perhaps it should not be expected to act on arthropod vector salivary peptides, because it is usually associated only with parasite–mammal interactions, e.g. tsetse fly-borne sleeping sickness caused by antigen-switching Trypanosoma brucei (Young et al. 2008) and anopheline mosquito-borne malaria caused by Plasmodium species with polymorphic surface antigens (Tetteh et al. 2009).
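
Maximum likelihood branch models of the kind referred to above are conventionally compared with a likelihood ratio test. The minimal Python sketch below assumes that a one-ratio null model is compared with a two-ratio model that allows a different nonsynonymous/synonymous rate ratio on a single foreground branch, giving one degree of freedom. The function name and the log-likelihood values are hypothetical and are not results from this study.

```python
from math import erfc, sqrt


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one free parameter.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value
    from the chi-square distribution with 1 degree of freedom.
    """
    # Numerical noise can make the statistic slightly negative; clamp at zero.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For 1 d.f., P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a one-ratio and a two-ratio branch model.
stat, p_value = lrt_one_df(lnl_null=-2451.30, lnl_alt=-2446.10)
print(f"2*delta(lnL) = {stat:.2f}, p = {p_value:.4f}")
```

A significant result indicates only that the rate ratio differs on the foreground branch. Inferring positive selection additionally requires the estimated ratio on that branch to exceed one.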

Our findings, from a combined population genetic and phylogenetic approach applied to sandfly salivary peptides, lead us to conclude that contemporary positive or balancing selection is absent from the functional and antigenic sites of the apyrase of P. ariasi. This supports the use of apyrase for vaccination against leishmaniasis (Oliveira et al. 2006) in Europe: the absence of balancing selection means that there are few alleles to compare for antigenic variation when developing a vaccine, and no allele is likely to gain a selective advantage from a vaccination programme. Moreover, the functional conservation of apyrase over geological time lends support to the possible use of apyrase as a vaccine candidate throughout the Mediterranean region.

Even in the western Mediterranean region, however, there are some important gaps in our knowledge of apyrase: contemporary selection has not been investigated for the two genes of the vectors sympatric with P. ariasi, and a healing-type DTH response needs to be confirmed for rodent models (Oliveira et al. 2006) and demonstrated for the domestic dog. The vaccination of this reservoir host is most likely to control not only a major veterinary problem but also human disease caused by the same parasite, both in Europe and Latin America (Collin et al. 2009; Ready 2010). Sandfly apyrases belong to the Cimex family, which includes human homologues (Valenzuela et al. 1998). Therefore, any vaccine faces the challenge of using apyrase to prevent Leishmania establishment without undermining host haemostasis. However, it is encouraging that key differences exist in the nucleotidase activity preferences and some amino acid functional sites of the sandfly and human apyrases (Hamasaki et al. 2009), despite the similarities of these calcium-dependent extracellular enzymes. Also encouraging is the finding that blocking the function of adenosine, the product of apyrase’s nucleotide hydrolysis, can decrease Leishmania parasitism (De Almeida Marques-da-Silva et al. 2008). So far, the control of leishmaniasis by a Th1-type CMI response stimulated by a specific salivary peptide (not apyrase) has been experimentally demonstrated for only one natural cycle, namely the transmission of L. infantum to the domestic dog by the neotropical vector L. longipalpis (Collin et al. 2009). This name is used for a complex of sandfly sibling species, and not all their populations have the same vectorial competence. Some of this geographical variation might well be explained by the natural variation in salivary peptides (Milleron et al. 2004). Similarly, our findings will also be of value to those investigating geographical variation in arthropod vector competence in relation to the potential of salivary peptides for disease control.


We thank Ana Aransay, Samia Boussaa, Robert Farkas, Montserrat Gállego, Parviz Parvizi and Carlos Pires for specimens, Alison Cownie and Alex Aitken for laboratory assistance, Julia Llewellyn-Hughes and Claire Griffin for DNA sequencing, Peter Foster for PAML training, Jesus Valenzuela for advice on salivary peptides and Bernard Pesson for unstinting support of field work. This research was funded by EU grant GOCE-2003-010284 EDEN and is catalogued as publication EDEN0269 (http://www.eden-fp6project.net/). It does not necessarily reflect the views of the European Commission.

Data archiving statement

Nucleotide sequence data reported in this article are available in the GenBank database under Accession Numbers JF899926–JF900009.