Dissecting functional components of reproductive isolation among closely related sympatric species of the Anopheles gambiae complex

Abstract Explaining how and why reproductive isolation evolves and determining which forms of reproductive isolation have the largest impact on the process of population divergence are major goals in the study of speciation. By studying recent adaptive radiations in incompletely isolated taxa, it is possible to identify barriers involved at early divergence before other confounding barriers emerge after speciation is complete. Sibling species of the Anopheles gambiae complex offer opportunities to provide insights into speciation mechanisms. Here, we studied patterns of reproductive isolation among three taxa, Anopheles coluzzii, An. gambiae s.s. and Anopheles arabiensis, to compare its strength at different spatial scales, to dissect the relative contribution of pre‐ versus postmating isolation, and to infer the involvement of ecological divergence on hybridization. Because F1 hybrids are viable, fertile and not uncommon, understanding the dynamics of hybridization in this trio of major malaria vectors has important implications for how adaptations arise and spread across the group, and in planning studies of the safety and efficacy of gene drive as a means of malaria control. We first performed a systematic review and meta‐analysis of published surveys reporting on hybrid prevalence, showing strong reproductive isolation at a continental scale despite geographically restricted exceptions. Second, we exploited our own extensive field data sets collected at a regional scale in two contrasting environmental settings, to assess: (i) levels of premating isolation; (ii) spatio/temporal and frequency‐dependent dynamics of hybridization, (iii) relationship between reproductive isolation and ecological divergence and (iv) hybrid viability penalty. Results are in accordance with ecological speciation theory predicting a positive association between the strength of reproductive isolation and degree of ecological divergence, and indicate that postmating isolation does contribute to reproductive isolation among these species. Specifically, only postmating isolation was positively associated with ecological divergence, whereas premating isolation was correlated with phylogenetic distance.


| INTRODUCTION
Speciation proceeds by the progressive establishment of reproductive isolation among genetically distinct populations. Reproductive isolation results from the combined effect of all barriers to gene flow, which can be conveniently classified into three functional components (Mallet, 2006): (i) natural selection against immigrants from alternative ecological niches; (ii) sexual isolation due to assortative mating or fertilization; and (iii) selection against zygotes formed when hybridization does occur. The latter mechanism can be further partitioned according to whether selection results from the presence of intrinsic genetic incompatibilities, expressed in the form of hybrid sterility or inviability independent of the environment, or rather from inferior hybrid genotypes as a result of ecological mechanisms, a process also known as extrinsic postmating isolation. Barriers to gene flow are therefore multifarious, with the accumulation of multiple mechanisms in the course of population divergence gradually producing an increase in the nature and strength of isolation. Accordingly, time since lineage divergence is a reasonable predictor of the strength of reproductive isolation, as closely related taxa exhibit weaker isolation compared to distantly related ones (Coyne & Orr, 1998;. While this relationship holds true noisily, several evolutionary processes can distort it when considering closely related taxa issuing from recent adaptive radiations: at this phylogenetic level, the strength of reproductive isolation can be modulated by geographical, historical or ecological factors that weaken or strengthen the barriers against gene flow. Explaining how and why reproductive isolation evolves and, in particular, determining which forms of reproductive isolation have the largest impact on the process of divergence, are major goals in the study of speciation. The relative importance of different categories of reproductive isolation in the process of speciation, however, has been the focus of much debate , 1997Rice & Hostert, 1993;Sobel, Chen, Watt, & Schemske, 2010). One of the main issues lies in the sequential nature of isolation mechanisms at different temporal scales. From a chronological perspective, measures of the strength of reproductive isolation should account for the fact that postmating barriers intervene after premating barriers have already filtered the potential for hybridization (Sobel & Chen, 2014). Moreover, several lines of evidence show that the strength of postmating isolation varies in relation to the degree of ecological divergence, whereas premating isolation follows the general pattern of increase with time since lineage divergence (Turelli, Lipkowitz, & Brandvain, 2014). At an evolutionary level, reproductive barriers continue to accumulate even after speciation is complete, so that the strength and nature of isolation in extant species may be radically different from that existing during early divergence (Sobel & Streisfeld, 2015). The study of recent adaptive radiations of partially isolated taxa allows identifying barriers involved in their initial divergence. This permits the separation of these barriers from other factors emerging when speciation is complete. Hence, studies of recently diverged taxa provide the most accurate picture of the barriers involved in speciation, and comparative analysis of multiple, recently diverged species may be the only way to assess the order in which barriers typically arise (Sobel et al., 2010).
In this respect, isomorphic members of the afro-tropical Anopheles gambiae sensu lato (s.l.) complex, which includes eight members exhibiting different degrees of reproductive isolation and ecological divergence, offer an opportunity to provide insights into the mechanisms of emergence of reproductive isolation and hybridization, whose implications are not only of interest to evolutionary biologists but also of importance for malaria control. Current malaria control strategies are implications for how adaptations arise and spread across the group, and in planning studies of the safety and efficacy of gene drive as a means of malaria control. We first performed a systematic review and meta-analysis of published surveys reporting on hybrid prevalence, showing strong reproductive isolation at a continental scale despite geographically restricted exceptions. Second, we exploited our own extensive field data sets collected at a regional scale in two contrasting environmental settings, to assess: (i) levels of premating isolation; (ii) spatio/temporal and frequency-dependent dynamics of hybridization, (iii) relationship between reproductive isolation and ecological divergence and (iv) hybrid viability penalty. Results are in accordance with ecological speciation theory predicting a positive association between the strength of reproductive isolation and degree of ecological divergence, and indicate that postmating isolation does contribute to reproductive isolation among these species. Specifically, only postmating isolation was positively associated with ecological divergence, whereas premating isolation was correlated with phylogenetic distance.

K E Y W O R D S
ecological speciation, hybridization, postmating, premating, reproductive isolation, species complex based on insecticides, which exert strong selective pressures on vector populations, affecting their ecology and behaviour (Kitau et al., 2012;Russell et al., 2011). Innovative vector-borne disease control strategies are based on population suppression or replacement by genetic technologies, which rely on driving genes of interest into natural mosquito populations. In both instances, adaptive introgression through hybridization can accelerate the evolution of insecticide resistance or gene drive across semipermeable species barriers (Norris et al., 2015).
Three species of the complex-An. gambiae sensu stricto (s.s.), Anopheles coluzzii and Anopheles arabiensis-have the widest range of distribution and represent the most efficient vectors of human malaria in sub-Saharan Africa, where they are responsible for hundreds of millions of infections and hundreds of thousands of deaths (WHO, 2016). The sibling species An. gambiae s.s. and An. coluzzii are believed to have split about 540,000 years ago and represent the most recent radiation in the species complex (Fontaine et al., 2015;Kamali, Xia, Tu, & Sharakhov, 2012). They correspond to two assorting taxonomic units provisionally named "molecular forms S and M" defined based on fixed mutations on the chromosome-X-linked rDNA intergenic spacer (IGS) (della Torre et al., 2001) and have only recently received formal Linnaean nomenclature (Coetzee, Wilkerson, della Torre, Coulibaly, & Besansky, 2013), based on evidence showing that their genomes contain regions of differentiation resulting in exclusive taxonomic clustering across much of their shared geographical range Neafsey et al., 2010;Reidenbach et al., 2012). To avoid confusion in the nomenclature, we chose to use the definition "An. gambiae (Giles)" for both taxa collectively, and "An. gambiae s.s." with reference to the single taxon formerly named as molecular form S. Genomic studies have clearly demonstrated that An. gambiae s.s. and An. coluzzii ancestor separated from other species in the group approximately 2 million years ago (Ma) and that the two species are distantly related to An. arabiensis (Fontaine et al., 2015;Kamali et al., 2012). The latter species can be distinguished from An. gambiae (Giles) by diagnostic fixed paracentric chromosomal inversions on the X heterosome (Coluzzi & Sabatini, 1967;White, 1971), as well as by diagnostic DNA-based differences in the IGS region (Scott, Brogdon, & Collins, 1993). Despite a history of extensive introgression among the three species (Fontaine et al., 2015), these are behaviourally, physiologically, ecologically and epidemiologically distinct and segregate along major ecoclimatic gradients at a continental scale. Overall, An. arabiensis and An. gambiae s.s. are sympatric across sub-Saharan Africa, but the former species dominates in more xeric areas and at higher altitudes, while it is virtually absent in the moist Guineo-Congolese rainforest block of Western and Central Africa (Coetzee, Craig, & Le Sueur, 2000;White, 1974). Anopheles coluzzii is only found west of the Rift Valley where it extensively overlaps with An. arabiensis in savannah areas . Conversely, there is evidence of strong segregation between An. gambiae s.s. and An. coluzzii, with a prevalence of the latter in ecologically more complex and more stable anthropogenic larval habitats (Gimonneau et al., 2012;Kamdem et al., 2012). Overall, therefore, An. coluzzii appears ecologically more similar to An. arabiensis compared to An. gambiae s.s.  suggesting that An. coluzzii is likely to have diverged from An. gambiae s.s. and to have invaded, during this process, the ecological niche of An. arabiensis.
All the siblings of the complex freely interbreed in laboratory cages; however, they show different degrees of postzygotic incompatibilities. The hybrid progeny resulting from crossing An. arabiensis with An. gambiae (Giles) (i.e., not distinguishing between An. gambiae s.s. and An. coluzzii) are generally fit and viable, but hybrid males are sterile, and backcross females show reduced fertility due to several incompatible alleles (Slotman, della Torre, Calzetta, & Powell, 2005;Slotman, della Torre, & Powell, 2004). The rarity of An. arabiensis × gambiae (Giles) individuals in natural field populations (Touré et al., 1998;White, 1971) is probably a consequence of strong (albeit incomplete) premating isolation and of selection acting against these hybrids (Mallet, 2005). Although experimental evidence is lacking, premating behavioural differences among An. arabiensis, An. coluzzii and An. gambiae s.s. imperfectly enforce isolation, given that mixed swarms are sometimes found in nature (Dabire et al., 2013). Conversely, there is no evidence of genetic incompatibilities in experimental crosses between An. coluzzii with An. gambiae s.s.: hybrids are viable and fertile, with no obvious loss in fitness in a laboratory setting (Diabaté et al., 2005(Diabaté et al., , 2007Hahn, White, Muir, & Besansky, 2012;Sawadogo et al., 2014). Assortative mating between An. coluzzii and An. gambiae s.s. is imperfectly maintained in nature (Tripet et al., 2001) and periodically breaks down, resulting in extensive hybridization Costantini et al., 2009;Lee et al., 2013;Oliveira et al., 2008) and in detectable levels of introgression and current gene flow (Marsden et al., 2011;Reidenbach et al., 2012;Weetman, Wilding, Steen, Pinto, & Donnelly, 2012). Several premating behavioural mechanisms such as spatial segregation of mating swarms (Diabaté et al., 2009), complex short-range acoustic recognition responses (Pennetier, Warren, Dabiré, Russell, & Gibson, 2010;Simões, Gibson, & Russell, 2017) and lags in circadian activity associated with reproductive behaviour (Sawadogo et al., 2013) might contribute to diminish heterospecific inseminations. It is plausible that reduced hybrid fitness mediated by extrinsic ecological factors, as opposed to intrinsic genetic incompatibilities, may further contribute to reproductive isolation, making these species another example of ecological speciation.
In recent species radiations like this one, ongoing hybridization and introgression may be a means of spreading adaptive traits across species boundaries (Fontaine et al., 2015). In the case of the An. gambiae species complex of malaria vectors, this may entail the exchange of genes that have an important bearing on aspects of vectorial capacity, including blocks of genes inside chromosomal inversions that have been implicated in aridity tolerance, and insecticide or parasite resistance genes (Clarkson et al., 2014;Djogbénou et al., 2008;Mancini et al., 2015;Norris et al., 2015;Sharakhov et al., 2006;White et al., 2011). The existence of interspecific gene flow even between nonsister species potentially accelerates adaptive change and speeds the development of resistance to vector control tools, but also may beneficially broaden the target populations susceptible to gene drive strategies aimed at reducing malaria transmission (Knols, Bossin, Mukabana, & Robinson, 2007;Scott, Takken, Knols, & Boete, 2002). For this reason, an understanding of the dynamics and the ecological conditions favouring hybridization between taxa in this group is of paramount importance.
Here, we take advantage of published data at continental scale as well as of extensive original field data sets collected at a regional scale in two contrasting environmental settings to dissect the different functional components of reproductive isolation among An. coluzzii, An. gambiae s.s. and An. arabiensis (from pre-to postmating isolation and ecological divergence) at different spatial scales. Altogether, our results point to the dynamic nature of reproductive isolation and show how different genetic, behavioural and ecological components interact in a complex way to modulate its strength.

| Frequency of hybrids in field populations at a continental scale: a meta-analysis
We performed a systematic search and review of field studies indexed in the publication repository PubMed (http://www.ncbi.nlm.nih.gov/ pubmed). We looked for studies reporting on the occurrence and frequency of An. gambiae s.s. and An. arabiensis populations in sympatry that were published between 1964 and March 2013. Articles were searched using the keywords "gambiae AND larvae," or "gambiae AND adults," in association with any of the following keywords: "distribution," "frequency," "identification," "sympatry," "hybrid," or "molecular form." From the results of the search, we retrieved the following information: country of collection, total number of specimens collected (females), number of hybrids identified, number of localities where the specimens were collected, and the name and geographical coordinates of the sites, whenever available. Articles reporting about specimens collected as larvae that were subsequently grown in insectaries were not included in the analysis to avoid biases due to rearing in artificial conditions.
As the status and nomenclature of the taxonomic subdivisions within An. gambiae s.s. have changed, we chose to classify results from articles that did not distinguish between the M and S molecular forms of An. gambiae s.s. (e.g., all those published before 2001) as An. gambiae (Giles), and results from studies referring to the M and S forms as An. coluzzii and An. gambiae s.s., respectively. When only the chromosomal form-not the molecular form-was defined, results were classified as An. gambiae (Giles) as the taxonomic subdivisions within each of these two classification systems do not coincide.
Four data sets were assembled, two sets to study the prevalence of An. arabiensis × gambiae (Giles) hybrids in larval and adult samples, and two sets for An. coluzzii × gambiae s.s. larval and adult hybrids, respectively. A meta-analysis was performed on each data set to test whether the prevalence of hybrids was homogeneous across studies, that is whether there was a "study effect" that might be related to geographical heterogeneities. In cases when there was no study effect, studies could be combined to provide an overall estimate of the prevalence of hybrids. Homogeneity across studies was assessed by the Cochran Q and I 2 statistics (R Core Team, 2013;Viechtbauer, 2010). Whenever the p-value of Cochran Q was greater than 5% and it was associated with a value of I 2 ≤ 25%, the studies were considered homogeneous and the relative frequency of hybrids was calculated considering a fixed-effects combined value. Conversely, if the studies were significantly heterogeneous, a random-effects combined value was calculated. Under the fixed-effects model, it is assumed that the true effect is the same in all studies, while under the random-effects model allowance is made for the true effect to vary across studies (Higgins, Thompson, Deeks, & Altman, 2003).

| Assessment of premating isolation
In anopheline mosquitoes, after copulation male sperms are stored by the female inside a spermatheca, from which they are later retrieved to fertilize eggs during oviposition (Clements, 1999). Thus, as members of the An. gambiae complex are essentially monandrous or nearly so (Tripet et al., 2001), it is possible to know whether an individual female copulated with a conspecific or heterospecific male by mo- The choice of locations was justified by several reasons: first, the sympatric occurrence of the taxa in comparable frequencies within each population, as expected from species distribution models developed previously for both countries Kamdem et al., 2012;Simard et al., 2009). This condition maximizes the opportunity for contact between taxa (and indeed was found to correlate with the highest frequency of hybrids in the population; cf. Section 3). Second, the high prevalence of An. coluzzii × gambiae s.s. hybrids recorded from small samples collected in two of these localities (Koukoulou and Kougoulapaka) during a countrywide survey carried out the year before sampling . Third, the different geographical mode of contact between An. coluzzii and An. gambiae s.s. in the two countries: in the arid savannah of Burkina Faso, the two taxa are essentially sympatric , whereas in the forested area of Yaoundé in southern Cameroon An. coluzzii and An. gambiae s.s. are parapatric as they segregate steeply along urbanization clines (Kamdem et al., 2012). Finally, the two environmental settings reflect different genetic backgrounds of the two species; in particular, chromosomal polymorphisms are markedly different in these two ecogeographical areas. While we could not explicitly test for the individual effect on sexual isolation of each of these variables (degree of overlap, habitat or chromosomal polymorphism) due to collinearity and lack of replication, we included a differentiated panel of populations to study general mechanisms of premating isolation.
Following the protocol of Tripet et al. (2001), adult females morphologically identified as An. gambiae s.l., or their spermatheca isolated from the rest of the carcass, were individually preserved in 1.5-ml microtubes containing 70% ethanol for ≥2 days to agglutinate the sperm in the form of a bundle. Mosquito carcasses were put in individual microtubes containing a desiccant (silica gel). After careful separation of the sperm mass from the chitinous shell of the spermatheca under a dissecting microscope, the sperm bundle was transferred in a microtube for molecular analysis. The identity of the female and associated paternity of the sperm were obtained by molecular identification following a PCR-RFLP protocol (Fanello et al., 2003) performed on a leg from each mosquito carcass and from DNA extracted from the sperm bundle with 5% Chelex (Walsh, Metzger, & Higuchi, 1991) or the Qiagen "DNeasy" extraction kit according to manufacturer's instructions. Specimens that did not return interpretable bands by the direct PCR were further processed by extraction of DNA from two legs by CTAB (Cornel & Collins, 1996), and the PCR repeated.
We have assessed the degree of assortative mating from indices of pair sexual isolation (I PSI ) using the software JMATING (Carvajal-Rodriguez & Rolan-Alvarez, 2006). I PSI is a safe estimator of sexual isolation for field samples, as it is not affected by small variances across different scenarios of frequencies of the taxa considered, nor by mate propensity, that is the intrinsic tendency for some individuals/populations/taxa to mate more frequently than others (Pérez-Figueroa, Cruz, Carvajal-Rodríguez, Rolán-Alvarez, & Caballero, 2005). Mean bootstrap values, standard deviations and two-tail probabilities for rejecting the null hypothesis (I PSI = 0, i.e., random mating) were calculated by bootstrap (100,000 resamplings, also for premating frequencies).
If two taxa mate partially at random, the frequency of homogamous mating depends on the probability of meeting between sexes of the same taxon. This probability depends on the frequency of individuals of the focal taxon in the whole population. The higher the frequency, the higher is the probability that conspecific individuals of the focal taxon will meet, and consequently levels of heterogamy will be lower. To test this hypothesis, we have fitted generalized linear models to the I PSI indices using the relative frequency in the population of the pair of taxa whose sexual isolation was under scrutiny as a covariate; the focal taxa were fitted as a factor and its interaction with the covariate tested in an ANCOVA-like arrangement I PSI = FREQUENCY * TAXA, which is formally: where the pair sexual isolation index of species pair i, j at location k is equal to the sum of linear components, with α and β representing the intercept and slope of the linear relationship, p ijk the frequency at location k of species i relative to species j, and ε are sampling errors around the regression line. We have assumed that errors are distributed normally and fitted GLMs with identity link functions. We tested the statistical significance of the interaction and main effects of the FREQUENCY and TAXA variables by likelihood-ratio tests on the change in deviance associated with removal of nested model terms, and searched for the minimal adequate model by means of the Akaike information criterion. While a parabolic relationship between I PSI and frequency, with a minimum at intermediate values and two local maxima at the opposite ends of the covariate scale, may better represent the dependence of sexual isolation on the relative abundance of the focal taxa, the limited size of our data set and range of observed p ijk values oriented our analysis towards a linear model, which can approximate parabolic-like relationships at one end of the covariate scale.

| Spatial and temporal patterns of hybridization in a "contact zone"
We exploited the large data set available along an urbanization gradient in the capital of Cameroon-Yaoundé (Kamdem et al., 2012), to analyse in more detail the spatial and temporal distribution of Sampling locations were separated by ~1 km each. Adult mosquitoes were collected by spraying an insecticide aerosol inside the sleeping rooms of 1,481 households, whereas larvae were collected by dipping from 3,421 aquatic habitats. Temporal clustering of hybridization was tested with the rank version of the von Neumann ratio test for randomness (Bartels, 1982) using the R package lawstat (Hui, Gel, & Gastwirth, 2008).

| Frequency of hybrids at a regional scale in populations from Burkina Faso and Cameroon
Similarly, we aimed to investigate how the spatial distribution of hybridization relates to the relative frequency of the parental species at a larger (regional) spatial scale using extensive data sets col- Molecular identification of individual specimens was performed using one of several possible diagnostic PCR protocols (Fanello et al., 2003;Favia, Lanfrancotti, Spanos, Sidén-Kiamos, & Louis, 2001;Santolamazza et al., 2008). It is worth noting that all the samples from Burkina Faso used in this analysis are from indoor-resting I ijk PSI = α ij +β ij p ijk +ϵ mosquitoes, a feature that relieves difficulties of interpretation due to the potential presence of a cryptic subgroup of An. gambiae (Giles) in this study area (Riehle et al., 2011) (for more details, see Appendix S1).
To verify whether the prevalence of hybrids changed as a function of the relative abundance of the parental taxa in the population, we have fitted generalized additive models (GAM) assuming that the fre- where A and B represent the absolute frequencies of the parental species. The models assume that the logarithm of the odds of the probability of hybridization h i is a linear function of the relative frequency of the parental taxa p i in each locality, that is: log e (h i ∕(1 − h i )) = α+ϕ(p i ) + , where ϕ is a nonparametric smoothed function, and ε represent the sampling error. We estimated h i and p i from the sample estimates Because hybridization in these regions is rare, we did not take into account the frequency of hybrids in the denominator of p i. We fitted GAM with the R package mgcv (Wood, 2006), specifying a binomial error structure and a logit link function. This approach has the added benefit of weighing each locality by its sample size. The statistical significance of the inclusion of the nonparametric smoothed function in the model was tested by the difference in AIC and likelihood-ratio test associated with removal of the nonparametric term from the model.

| Assessment of postmating isolation
To assess the relative contributions of pre-versus postmating isolation, we modelled the prevalence of hybrids in a population in relation to the proportion of individuals that mate at random according We estimated the parameters s and s* by fitting nonlinear weighted least-squares regressions to the observed frequency of heterogamous mating or hybrids in a population, according to the F I G U R E 1 Contribution of premating versus postmating barriers towards the prevalence of hybrids (h) under a mode of frequencydependent hybridization; p is the proportion in the population of one of the two hybridizing species. The parameter s represents the proportional decrease in hybrids from Hardy-Weinberg expectations due to assortative mating-a measure of the strength of the premating barrier. When s = 1, mating is random; that is, there is no premating isolation. The parameter s* represents the proportional decrease in hybrids from Hardy-Weinberg expectations due to total reproductive isolation. The fraction s*/s denotes the proportion of viable hybrid zygotes, which accounts for the proportional decrease in hybrids from Hardy-Weinberg expectations due to postmating isolation. The example shows the disproportionate contribution of the premating barrier to total reproductive isolation when both s and s*/s are equal to 0.2, which is due to the sequential nature and different scaling properties of the two processes. Other cases, not shown, are equally possible

| Ecological divergence among members of the Anopheles gambiae complex
One of the predictions of ecological speciation theory is that reproductive isolation should correlate directly with the extent of ecological divergence (Funk, Nosil, & Etges, 2006). We aimed to verify this prediction in our biological system by quantifying the degree of ecological divergence among the three species under study, and comparing it to the strength of pre-and postmating isolation assessed from previous analyses. To assess the degree of ecological divergence, we calculated bootstrapped Pianka indices of niche overlap with the R package spaa (Ulrich & Gotelli, 2010;Zhang, 2004). The Pianka index is a symmetrical index assuming values between 0 and 1, derived from the competition coefficients of the Lotka-Volterra equations (Pianka, 1974):

| RESULTS
We examined reproductive isolation among An. coluzzii, An. gambiae s.s. and An. arabiensis with a multipronged approach, taking as indicator of hybrid genotypes the heterozygous patterns of species-specific diagnostic markers in the pericentric region of chromosome-X (Fanello et al., 2003;Favia et al., 2001;Santolamazza et al., 2008). First, we performed a systematic review and meta-analysis of published surveys reporting on hybrid prevalence, which overall showed strong reproductive isolation at a continental scale despite geographically restricted exceptions. Second, we exploited our own extensive field data sets collected at a regional scale in two contrasting environmental settings, that is in the arid savannah of Burkina Faso in West Africa, and in the moist rainforest of Cameroon in Central Africa, to analyse: (i) levels of premating isolation in the species triplet; (ii) the spatio/ temporal and frequency-dependent dynamics of hybridization, (iii) the relationship between postmating isolation and ecological divergence and (iv) hybrid viability penalty.

| Hybrid prevalence at a continental scale
To assess the level of hybridization among An. coluzzii, An. gambiae s.s.
and An. arabiensis in the African continent, we performed a systematic review of field studies reporting the occurrence and frequency of populations in sympatry that were published between 1964 and March 2013. and 0.44% for larvae and adults, respectively) but not homogeneous across studies/sites (Table 1), as expected due to the known existence of a restricted "high hybridization area" (HHA) at the westernmost limit of their distribution range (Caputo et al., 2008(Caputo et al., , 2014Marsden et al., 2011;Oliveira et al., 2008). It is important to note that in the core of HHA in Guinea Bissau, IGS-heterozygous patterns cannot be taken as proxy of F1 hybrids as in the rest of Africa . In fact, genomic studies revealed that these patterns result from high hybridization rates stable since at least the 1990s (Oliveira et al., 2008) leading to a novel "hybrid form" (Vicente et al., 2017) with polymorphic ribosomal IGS alleles due to recombination within chromosome-X centromeric regions as well as within the multicopy ribosomal region (Caputo et al., 2016).  in An. coluzzii, the difference being not significant (G test: χ 2 = 1.50, df = 1, p = .221).

| Premating isolation at local scales
As expected, all indices of pair sexual isolation (I PSI ) significantly departed from random mating (p ≪ .001), with values >0.8 and mostly close to complete assortative mating (i.e., I PSI = 1; Figure 2). In most localities, premating isolation was stronger between An. arabiensis vs.

| Spatial and temporal dynamics of hybridization along a "contact zone" in Cameroon
To observed hybridization rates increased in July, peaked in August, to decline later on below the threshold of detection after October (Table S7). Spatially, hybrids occurred in localities 8-7-5-4 of fig. 1E in Kamdem et al. (2012), in that temporal order, which is consistent with the "contact zone" of An. coluzzii and An. gambiae s.s. observed along the urbanization gradient (Kamdem et al., 2012 (Figures 3 and 4). Accordingly, it can be inferred that the expected prevalence of hybrids in a population is highest when the parental taxa occur at comparable frequencies, that is when their degree of contact is maximal. In all instances, the nonparametric smoothers accounted for a significant proportion of the deviance explained, which was comparable across species pairs, and models including the smoothers showed the lower AIC compared to the null model with only the parametric term α (Table 3). These results justify the H-W approach for assessment of postmating isolation from these data sets, as detailed below (Section 3.6). showing a strong deficit of hybrids and excess of parental taxa (Table 4).

| Postmating isolation at local scales
Similarly, the analysis of An. coluzzii hybridization with An. gambiae s.s. in southern Cameroon showed that the frequency of heterospecific mating was significantly greater than the prevalence of larval hybrids (Pearson's chi-square test with Yates' continuity correction: χ 2 = 11.9; df = 1; p < .001), even more so in the case of adult samples, given that not a single An. coluzzii × gambiae s.s. hybrid was observed out of 1,465 adult mosquitoes of the two species collected in the area (Kamdem et al., 2012).
These results suggest that postmating isolation contributes to the reproductive barrier among these taxa. However, the evaluation of the strength of postmating isolation relative to the premating barrier is hampered by the large sampling error associated with results based on surveys from single localities and on relatively small samples (e.g., in Burkina Faso, only four hybrids were observed despite >2,700 mosquitoes collected and analysed, see Table 4), as well as by the dependence of hybridization on the frequency of the parental species, as demonstrated from previous analyses. Therefore, to provide an average estimate of the strength of postmating isolation relative to premating isolation, we have also implemented the alternative analytical approach represented by the H-W model, as detailed below.

| Postmating isolation and ecological divergence at regional scales
Estimates of the s and s* parameters of the H-W model are presented in

| DISCUSSION
In this work, we provide a view of the emergence and modulation of reproductive isolation among three sympatric members of the First, we have performed a systematic review and meta-analysis of published surveys reporting on hybrids prevalence, which has also served to evaluate the variability and strength of reproductive isolation at a continental scale. Then, we have analysed our own field data sets collected at a regional scale in two contrasting environmental settings, that is in the arid savannah of Burkina Faso in West Africa, and in the rainforest of Cameroon in Central Africa. In this way, we have gathered the most exhaustive data set to date of hybrids occurrence and frequency in this species triplet, which has facilitated comparisons of the expected prevalence of hybrids in locales where we also obtained field measures of interspecific mating events. The emerging pattern from these results is that in this species triplet sexual isolation is positively correlated with phylogenetic distance. However, in the case of postmating isolation, we have found evidence for the occurrence of ecologically based modulation of the strength of the reproductive barrier. Accordingly, we have also looked for evidence of selection against hybridization by comparing the prevalence of hybrids in larval and adult samples, with the expectation that in the presence of selection against hybrids, the prevalence should decrease across stages as a consequence of hybrid mortality.
The main conclusions are further discussed below. It must be underlined, however, that rare events are challenging to study, and despite the analysis of several tens of thousands mosquitoes and an extensive data set gathered during more than 10 years of field studies across Africa, our results are still based only on some tens to a few hundreds of hybrid individuals. Moreover, the identification of hybrids was based on typing only the species-diagnostic X-linked rDNA locus, which cannot provide information about advanced generation hybrids and levels of population admixture. These limitations should be kept in mind when gauging the level of confidence we can attach to these results. On the other hand, our data set is the most comprehensive to date to allow such kind of analyses.

| Ecological postmating isolation
Support for the occurrence of extrinsic postmating isolation in this species trio has emerged from two pieces of evidence. On the one hand, the observed frequency of hybrids in populations at both local and regional scales was less than that expected from the observed frequency of heterogamous mating among these species.  An. coluzzii vs. gambiae s.s.
Parental taxa relative frequency, p Proportion of hybrids, h T A B L E 3 Results of generalized additive models fitted to the frequency of hybrids in field samples from Burkina Faso and Cameroon. The models test the significance of nonparametric smoothed functions of the explanatory variable "relative frequency of the parental taxa" p on the response variable "frequency of hybrids" h (cf. Figures 3 and 4). Statistical significance in each case was assessed by likelihood ratio tests (D) and by the difference in the Akaike information criterion (∆AIC) among the model containing the nonparametric smoothed function and the null model without the smoother. Negative values of ∆AIC denote that the AIC of the model with the smoother was lower than that of the model without the smoother. n: number of samples; edf: equivalent degrees of freedom; χ 2 : chi-square value; p: p-value  (Dao et al., 2014;Tene Fossog et al., 2014;Gimonneau et al., 2012;Lehmann & Diabate, 2008 (Birkhead & Pizzari, 2002;Price, Kim, Gronlund, & Coyne, 2001). However, it has been suggested that both sexual conflict and T A B L E 4 Estimated frequency of Anopheles gambiae s.l. taxa based on rates of homospecific and heterospecific mating in three populations from Burkina Faso

An. coluzzii (C)
An. gambiae s.s. many forms of sexual selection are expected to operate independently from the environment, so that other forms of cryptic postmating isolation mechanisms that occur after copulation but before hybrid offspring are produced, like reduced female fitness resulting from heterogamous insemination, better explain those instances in which taxa exhibit stronger reproductive isolation in ecologically divergent populations (Nosil & Crespi, 2006).

| Frequency-dependent hybridization
Overall, we found that higher prevalence of hybrids in natural field populations was associated with locales where the relative abundance of the parental taxa was closer to 1:1 proportions. This frequencydependent pattern of hybrid prevalence is concordant with the frequency-dependent pattern of sexual isolation observed when considering An. gambiae s.s. as one of the parental taxa. The index of sexual isolation decreased as the relative abundance of this species in the population was closer to 50% relative to An. coluzzii or An. arabiensis.
In this case, therefore, a higher prevalence of hybrids in the population is expected at intermediate frequencies of the parental taxa because under these circumstances premating isolation is also weaker.
In the case of An. arabiensis and An. coluzzii, there was no evidence of frequency-dependent sexual isolation, although in this case it is statistically more difficult to demonstrate such an effect, given that the strength of premating isolation between these species remained >97% regardless of the relative abundance of the parental taxa in the population.
Beyond the effect due to sexual isolation, the higher prevalence where the fitness of the parental species is greater, such as at the opposing ends of an ecological gradient to which each parental populations is specifically adapted (Kirkpatrick, 2001;Seehausen, 2004;Stankowski, 2013). At the ends of the gradient, therefore, each species will alternatively predominate over the other. It is in the centre of the law (these taxa have been identified by a single X-linked SNP), and the observed pattern may simply reflect the effect of sampling (binomial) errors around the fitted model curve. The same pattern has also been observed in the HHA (Marsden et al., 2011), and it was interpreted as asymmetric mating events in which rare females from An. coluzzii mate with An. gambiae s.s. more easily because no An. coluzzii males are to be found. There is also another observation from our data set that is compatible with this explanation: hybrids co-occur more with An. gambiae s.s. than An. coluzzii (Table 6), which may suggest that postmating isolation (or hybrids dispersal) is asymmetrical (i.e., stronger when An. coluzzii is more abundant). Unfortunately, our data set is probably too noisy to test this hypothesis.

| Geographical mosaic of reproductive isolation and narrow bands of hybridization
The local (regional) pattern of hybrid prevalence described above appears embedded within continent-wide differences in hybridization rates. We have quantitatively confirmed the occurrence of geographical heterogeneities between An. coluzzii and An. gambiae s.s. consistent with the "geographical mosaic of reproductive isolation" scenario highlighted by Lee et al. (2013), which is likely the product of the complex interactions among several-mostly unknown-factors. These mosaics can become apparent when data from areas sampled exhaustively are available, such as the case of Burkina Faso (this work) or Mali (Lee et al., 2013).
On the one hand, at a regional-scale hybridization appears greater in the map of their Figure 3). The rate of hybridization appeared to vary as a function of the degree of contact among taxa not only spatially, but also temporally. In the rain forest of Cameroon, for example, the degree of contact between An. coluzzii and An. gambiae s.s. weakens when An. gambiae s.s. populations collapse during drier periods due to lower availability of suitable rain-dependent larval habitats (Tene Fossog et al., 2014;Kamdem et al., 2012). In the xeric savannah of West Africa, An. gambiae s.s. local populations extinguish during the dry season probably migrating towards more humid habitats (Dao et al., 2014) while An. coluzzii continues to breed in more permanent anthropogenic larval habitats like irrigated fields, artificial water reservoirs or rice paddies (Gimonneau et al., 2012).
At a continental scale, it is possible to hypothesize that also the HHA in Guinea Bissau and Senegambia has originated from secondary contact between populations that have evolved in allopatry (Caputo et al., 2014;Marsden et al., 2011;Pinto et al., 2013). Marsden et al. (2011) (Dabire et al., 2013), in particular when occurring in less favourable ecological contexts (Niang et al., 2015). This explanation is compatible with several outcomes of our study. First, it is significant, although admittedly conjectural because of the tiny sample, that we observed three An. arabiensis × coluzzii hybrids mating in the most nonassortative way possible, that is with one each of the three parental taxa available in the local population. This observation suggests that mate choice may be disrupted by hybridization. Second, we found that in our population from Burkina Faso An. gambiae s.s. was the least homogamous of the three taxa, suggesting that this species may be characterized by some mate choice traits leading it to behave more promiscuously than An. arabiensis or An. coluzzii. Notably, whole-genome sequencing revealed that Guinea

Bissau coastal region harbours a hybrid form characterized by an
A. gambiae-like sex chromosome and massive introgression of A. coluzzii autosomal alleles (Vicente et al., 2017).

| CONCLUSIONS
Investigating incompletely reproductively isolated taxa of the Anopheles gambiae complex, we have provided evidence that preand postmating reproductive barriers can follow different pathways in recent adaptive radiations. On the one hand, degree of premating isolation has been shown to increase in the course of lineage divergence, likely reflecting the accumulation of different species recognition mechanisms, independently of ecological context. On the other hand, postmating isolation appears to be modulated by the strength of current ecological divergence. This pattern is consistent with the scenario depicted in Figure 5 in which a group of three taxa with permeable reproductive barriers (A, B and C, corresponding in this case to An. gambiae s.s., An. arabiensis and An. coluzzii, respectively) show the degree of reproductive isolation that correlates with ecological divergence, and in which a taxon C is ecologically closer to the most phylogenetically distant taxon B. Under a mode of accumulation of barriers to gene flow that is independent of ecological processes, it is expected that isolation will follow the usual pattern of increase with time since lineage divergence. Conversely, when ecological factors foster the emergence and development of reproductive isolation, ecological speciation theory predicts that the taxa A and C are expected to be less isolated than A and B or B and C (Noor, 1997;Rundle & Nosil, 2005). In accordance with this scenario, we found that postmating barriers are stronger between An. coluzzii and An. gambiae s.s., and An. arabiensis and An. gambiae s.s., and weaker between An. arabiensis and An. coluzzii suggesting that ancestral postmating barriers can increase or be relaxed in response to ecological niche dynamics. These patterns appear to be modulated by the probability of interspecific encounters of the three taxa at a local scale, whereas at a continental scale hybridization seems to occur mostly along narrow contact areas between populations with divergent ecological requirements and geographical histories, demonstrating the unappreciated importance of ecogeographical isolation barriers in this species complex.
Incontrovertible evidence of naturally occurring F1 hybrids between species in the An. gambiae complex was first discovered half a century ago (White, 1971), and low levels of contemporary hybridization continue to be observed in surveys across sub-Saharan Africa, as shown by the systematic review here presented (Table 1, Appendix S2 and Tables S1-S4). Until relatively recently, two factors hindered appreciation of the role of hybridization in this group. First, the influence of Ernst Mayr and Theodosius Dobzhansky perpetuated the prevailing view that hybrids were aberrant, evolutionary dead ends (Coyne & Orr, 2004;Seehausen, 2013).

ACKNOWLEDGEMENTS
This work elaborates data that were gathered in the course of more than 10 years within the framework of several research projects carried out in Burkina Faso and Cameroon, and it would not have been possible without the precious help of many people. We would like to acknowledge the Heads, staff, scientists and field teams of CNRFP