Demographic history and genomics of local adaptation in blue tit populations

Abstract Understanding the genomic processes underlying local adaptation is a central aim of modern evolutionary biology. This task requires identifying footprints of local selection but also estimating spatio‐temporal variations in population demography and variations in recombination rate and in diversity along the genome. Here, we investigated these parameters in blue tit populations inhabiting deciduous versus evergreen forests, and insular versus mainland areas, in the context of a previously described strong phenotypic differentiation. Neighboring population pairs of deciduous and evergreen habitats were weakly genetically differentiated (F ST = 0.003 on average), nevertheless with a statistically significant effect of habitat type on the overall genetic structure. This low differentiation was consistent with the strong and long‐lasting gene flow between populations inferred by demographic modeling. In turn, insular and mainland populations were moderately differentiated (F ST = 0.08 on average), in line with the inference of moderate ancestral migration, followed by isolation since the end of the last glaciation. Effective population sizes were large, yet smaller on the island than on the mainland. Weak and nonparallel footprints of divergent selection between deciduous and evergreen populations were consistent with their high connectivity and the probable polygenic nature of local adaptation in these habitats. In turn, stronger footprints of divergent selection were identified between long isolated insular versus mainland birds and were more often found in regions of low recombination, as expected from theory. Lastly, we identified a genomic inversion on the mainland, spanning 2.8 Mb. These results provide insights into the demographic history and genetic architecture of local adaptation in blue tit populations at multiple geographic scales.

Here, we aimed at investigating various demographic and genomic aspects of the adaptive divergence of a well-studied passerine bird, the Blue tit (Cyanistes caeruleus). Populations of small insectivorous passerines have long been used to study local adaptation, in both quantitative genetics and population genetics frameworks (Broggi, Hohtola, Orell, & Nilsson, 2005;Carbonell, Perez-Tris, & Telleria, 2003;Laaksonen, Sirkiä, & Calhim, 2015). In particular, several blue tit populations breeding in heterogeneous habitats in Southern France (Figure 1a) offer an ideal context to study local adaptation. Four of them (two deciduous and two evergreen, circled in black in Figure 1a) have been subject to a long-term project spanning more than 40 years (Blondel et al., 2006;Charmantier, Doutrelant, Dubuc-Messier, Fargevieille, & Szulkin, 2016). These populations show marked quantitative phenotypic differences (Figure 1b,c), notably in morphological (e.g., tarsus length and body mass), life-history (lay date and clutch size), and behavioral traits (e.g., song characteristics and handling aggression) . These phenotypic differences were found at two spatial scales. First, F I G U R E 1 Map of the sampling locations of blue tit populations on the mainland and in Corsica (a) and phenotypic differences (b). In red, habitats dominated by deciduous oaks and in green, habitats dominated by evergreen oaks. Sites with long-term monitoring are circled in black on the map (A1D corresponds to D-Rouvière in Charmantier et al. 2016, B3E to D-Pirio, B4D to D-Muro and B4E to E-Muro). On figures b & c are presented illustrative phenotypic differences (with standard deviations) between the four main populations with long-term monitoring. Are shown laying date (days), clutch size and male tarsus length (mm) (data from table 1 in Charmantier et al., 2016 Evol Appl, see this reference for detailed information for these measures and for differences in many other traits) birds breeding in deciduous forest habitats are taller, more aggressive, and lay larger and earlier broods than birds in evergreen forests (see table 1 in Charmantier et al., 2016). Strikingly, neighboring populations in deciduous and evergreen habitats are weakly genetically differentiated (Dubuc-Messier et al., 2018;Porlier, Garant, Perret, & Charmantier, 2012;Szulkin, Gagnaire, Bierne, & Charmantier, 2016) despite the short spatial scale, which questioned the mechanisms of persistence of the observed phenotypic differentiation against presumably large gene flow. Second, insular blue tits from Corsica, that might have been isolated since the sea level rise after the last glacial maximum (the sea level raised of 120 m from 17,000 to 5,000 years ago [Jouet et al., 2006;Lambeck & Bard, 2000]), are smaller and more colorful than their mainland counterparts (again, see table 1 in Charmantier et al., 2016) and are listed as a separate subspecies (Cyanistes caeruleus caeruleus on mainland Europe and Cyanistes caeruleus ogliastrae mainly in Corsica and Sardinia). Overall, traits displaying these strong phenotypic differences had heritabilities ranging from 0.20-0.43 (e.g., for lay date, [Caro et al., 2009]) to 0.42-0.60 (e.g., for tarsus length, [Delahaie et al., 2017;Perrier, Delahaie, & Charmantier, 2018;Teplitsky et al., 2014]), are classically related with fitness, and hence could be involved in a local adaptation process. In particular, the heritability of lay date and the breeding time gap between populations could result in reproductive isolation by breeding time, limiting gene flow, and favoring local adaptation. The studied traits were typically quantitative , hence probably controlled by a polygenic architecture  composed of many loci with small individual effects, as found in similar traits for other passerine birds (Bosse et al., 2017;Hansson et al., 2018;Lundregan et al., 2018;Santure et al., 2013). Overall, given their phenotypic, demographic, and genetic characteristics, these blue tit populations provide an ideal framework to study the genomic architecture of polygenic adaptation in heterogeneous environments.
We investigated genome-wide patterns of genetic diversity and differentiation and the demographic history between several populations of blue tits from Southern France, in heterogeneous forest habitats (deciduous vs. evergreen) and in insular (Corsica island) and mainland areas (Mainland France) in order to better understand the determinants of their local adaptation. The analysis was based on Resequencing of birds sampled in four sites studied in the context of a long-term project (Blondel et al., 2006;Charmantier et al., 2016) together with three additional pairs of deciduous and evergreen forests in order to test for parallel evolution (see Figure 1a). First, we investigated variation in genetic diversity and differentiation in order to verify that habitat type and geographic distance explained a significant proportion of the genetic structure between populations . Second, we investigated the historical and contemporary demography of each deciduous and evergreen population pair in order to better understand the origin of their differen-  (Seutin, White, & Boag, 1991).

| Molecular biology and sequencing
DNA extractions were achieved using Qiagen DNeasy Blood & Tissue kits and were randomized across sites. DNA was quantified using first a NanoDrop ND8000 spectrophotometer and then a Qubit 2.0 fluorometer with the DNA HS assay kit (Life Technologies).
DNA quality was examined on agarose gels. Library preparation using RAD-seq (restriction-site-associated DNA sequencing; (Baird et al., 2008)) with the enzyme SbfI was done by Montpellier GenomiX

| Bioinformatics and data filtering
Raw sequences were inspected with FastQC (Andrews, 2010) for quality controls. Potential fragments of Illumina adapters were trimmed with Cutadapt (Martin 2011), allowing for a 10% mismatch in the adapter sequence. Reads were filtered for overall quality, demultiplexed, and trimmed to 85bp using process_radtags, from the Stacks software pipeline 1.39 (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013), allowing for one mismatch in the barcode sequence.
Sequencing RAD-tags resulted in a median value of 5,449,564 reads per individual. BWA-MEM 0.7.13 ) was used to map individual sequences against the reference genome of the great tit Parus major (Laine et al., 2016) and to produce sam files using default options. Indeed, although great tit and blue tit diverged about 7-14 Ma (Päckert et al., 2007), the use of a reference genome over a de-novo approach is very often highly recommended (see, e.g., Rochette & Catchen, 2017). It is also interesting to note that synteny is typically highly conserved in birds (Backström et al., 2008), suggesting that aligning reads on a divergent species would still allow analyses based on SNP position along the genome. On average, 93% of the raw reads were mapped against the genome (Table S1).
Samtools 0.1.19 ) was used to build and sort bam files.
We used pstacks to treat bam files, align the reads as assembled loci, and call SNPs in each locus. We used a minimum depth of coverage (m) of 5, the SNP model, and alpha = 0.05 (chi-square significance level required to call a heterozygote or homozygote). cstacks was used to build the catalog of loci, allowing three mismatches between sample loci when building the catalog. sstacks was used to match loci against the catalog. Lastly, populations program in Stacks was used to genotype individuals. In this program, relatively permissive filters were applied, to retain SNP genotyped in at least 50% of individuals (all individuals from all sites grouped) and with a minimum read depth of 4. 350,941 SNP and 947 individuals were obtained in this "GeneralDataset." We hence applied additional filters to this GeneralDataset using the programs VCFtools (Danecek et al., 2011) and Plink (Purcell et al., 2007). We filtered for a minimum average read depth of 10 and a maximum average read depth of 100 (corresponding approximately to the 5%-95% distribution of read depth). SNP genotyped in <80% of the individuals were removed. We removed SNPs with observed heterozygosity ≥0.65 among individuals on Corsica or on mainland to reduce the potential occurrence of stacked paralogues.
Individuals below 85% genotyping rate were removed. Identity-bystate (IBS) was estimated for three replicated individuals (from the DNA extraction to SNP calling) in order to investigate reliability of the entire genotyping process. The IBS measured between replicates was high, ranging from 0.9989 to 0.9999, indicating very low genotyping error rate. These replicates were then removed from the dataset. Using the R packages gdsfmt and SNPRelate (Zheng et al., 2012), we measured the realized genomic relatedness (GRM) between individuals in order to remove highly related individuals (i.e., full-sibs and parent-offspring). For each pair of individuals with relatedness ≥0.35, we removed one individual. This procedure was applied in order to limit biases due to highly related individuals. Indeed, since we sampled breeding birds at their nests for the four sites studied on the long term and that they tend to disperse relatively close to their nests, we expected a higher percentage of related individuals than there actually is in the population. We then limited the number of individuals to 60, chosen at random, in populations (A1D, B3E, B4D, & B4E) in which a large number of individuals were genotyped.
This limitation was achieved in order to limit differences in analyses precision between populations due to unequal sample sizes (all individuals will be used in another ongoing study of the genomic architecture of quantitative adaptive traits). We hence removed potential monomorphic SNPs and created the dataset "FilteredDataset." 454 individuals (Table 1)  Further filtering (e.g., MAF) was often operated, depending on the analyses, and thereafter mentioned if it was the case. To perform analyses of demographic history using ABC analyses, we produced a haplotype VCF file for these 454 aforementioned individuals with the populations module of Stacks and we filtered it as explained in the Note S1.

| Analysis of genetic structure and effects of environmental variables
We used PCAs with the function snpgdsPCA from snprelate to depict genetic structure between individuals. PCAs were run for the en- age Vegan (Oksanen et al., 2007). We investigated the proportion of genetic variability explained by a constraining covariance matrix consisting of phenology, latitude, longitude, and altitude for each individual. We tested the global significance of the model using 1,000 permutations. We ran marginal effects permutations to address the significance of each variable. Then, we focused on the effect of phenology alone, using partial RDA to take into account the effect of latitude, longitude, and altitude. Significance was tested running 1,000 permutations. For the PCA, the admixture analysis and the RDA, we selected from the FilteredDataset the SNPs with more than 95% genotyping rate, MAF >0.05, retaining one SNP per locus and we remove SNP in linkage disequilibrium using the Plink command "indep 50 5 2." Genome-wide differentiation between each sampling location was measured with Weir and Cockerham's F ST estimator (Weir & Cockerham, 1984) implemented in StAMPP (Pembleton, Cogan, & Forster, 2013). F ST was estimated for all the SNPs, the ones on the autosomes and the ones on the sex chromosome Z separately.
Significance was assessed using 1,000 bootstraps replicates.

| Analysis of demographic history
Alternative models of divergence history including the effects of selection at linked sites affecting Ne and of differential introgression We also investigated the historical demography of the populations from Corsica as compared to the ones from the mainland. Gene flow has probably been impossible since the last deglaciation and sea level rise. Therefore, we compared models of ancient migration (AM) and SI. The pipeline described in the above section, integrating linked selection and barriers to gene flow, was run between the A1D samples (chosen on the mainland because it had the largest sample size) and B4D (chosen in Corsica because it had both a large sample size and was from the same habitat as A1D). We used the same ABC pipeline for the model selection procedure and parameter estimation as described in the above section. Finally, we attempted to convert demographic parameter into biological units assuming a mutation rate of 1e-8 mutations/bp/generations.

| Genomic diversity
Genome-wide genetic diversity was inferred for each sampling location in the dataset by measuring observed heterozygosity (Ho), pro- score. Finally, we tested for differences in the distribution of run of homozygosity (ROH) between the mainland and Corsica that may have resulted from smaller Ne and larger inbreeding in Corsica versus the mainland. We used plink 1.9 to estimate the length and number of ROH. We required a window of 500 kb to be homozyguous in order to be considered as a ROH, and with a maximum of 100 SNP.

| Identification of genomic footprints of selection
We used three methods to search for outlier SNPs potentially under divergent selection between blue tit populations. First, we used Bayescan V2.1 (Foll & Gaggiotti, 2008)  We filtered each of the five datasets for a minimum MAF of 0.05.
We used default parameters except for prior odds that were set at 10,000 in order to limit false positives. We investigated the paral-  (2018), such a multivariate method may be more suitable than univariate ones to detect weaker footprints of adaptation that are expected in polygenic adaptations in response to complex environmental heterogeneity. Using a similar procedure as described earlier in the methods, we used two RDAs constrained to investigate the effect of phenology (a) or of the geography (b). We then used a three standard deviation cutoff as suggested by Forester et al. (2018) to list loci with outlier loading scores on the first RDA axes. We compared the loci found using these different methods.
We reported in which genes these outliers were found (the list of genes can be found together with the genome published by Laine et al., 2016 on NCBI).

| Variation of genomic differentiation with recombination rate
We investigated variation of F ST with local recombination rate and whether SNP outliers were more often found in regions of low recombination than elsewhere in the genome. We estimated local recombination rates using a coalescent method implemented in We estimated ρ in a composite dataset with individuals from every population, habitats, and sex. We tested for correlations between SNP F ST and recombination rate using linear models, for Corsicamainland and for deciduous-evergreen differentiation, independently, and we represented the correlation using a LOESS fit. We tested whether outliers found using Bayescan and the RDA method for both Corsica-mainland and deciduous-evergreen differentiation were more often found in regions of low recombination than elsewhere in the genome using chi-square (χ 2 ) tests, and we represented the pattern using histograms.

| Detection of genomic inversions
We searched for potential genomic inversions using a variety of descriptive statistics. First, we searched for genomic regions having a particularly low recombination rate and large long-distance linkage disequilibrium, nevertheless associated with a high density of SNPs and therefore unlikely to correspond to peri-centromeric regions but rather to local suppression of recombination that may be due to inversions. Second, we implemented a PCA sliding window analysis in order to identify portions of the genome with individuals carrying an inversion at the homozygous or heterozygous state or individuals exempt from the inversion (Ma & Amos, 2012). We first used 10Mb windows sliding by 1 Mb and then 1 Mb windows sliding by 100 kb, in order to use enough SNPs to perform PCAs. We then used Lostruct (Li & Ralph, 2019), with k = 2, sliding by 100 SNPs, to identify particular blocks of linked SNPs explaining an abnormally high proportion of variance between two groups of individuals (e.g., inverted and noninverted).
We detected one putative inversion. We verified, using admixture with K = 2 for analyzing SNPs from the inversion, that putative heterozygous individuals had an admixture ratio close to 1:1 of both putative inverted and noninverted homozygous clusters. We then inspected variations of F ST (per SNP) and π (per 10kb window, using vcftools) along the genome between individuals that were inverted homozygous, noninverted homozygous, and heterozygous for the inversion, looking for potentially reduced diversity and increased differentiation at the inversion. We looked for potential salient variations in read depth in and around the putative inversion. We also tested whether the detected inversion was at Hardy-Weinberg equilibrium (using a common χ 2 test) and whether the frequency of the inversion varied geographically and between evergreen and deciduous habitats.
To study the history of the putative inversion identified, we aimed at measuring intra-and interspecific genetic distance and ab-

| Gene ontology
We used the R package topGO (Alexa & Rahnenführer, 2009) to investigate the potential gene ontologies (GO) that were statistically enriched for the sets of genes identified among outliers and the inversion, compared to the entire list of genes in which all the SNPs from the entire dataset were found. For the outlier tests, we used GO analyses for each gene list obtained using the different outlier identification methods but also for aggregated lists of all of the gene identified for deciduous-evergreen tests or for Corsica-mainland tests. We used the GO referenced for the zebra finch, T. Guttata (tguttata_gene_ensembl). We report a Fisher enrichment test pvalue and a p-value after applying a Benjamini and Hochberg correction for multiple testing to control for false discovery rate.

| Genetic structure and effects of environmental variables
The admixture analysis suggested the existence of two main distinct genetic groups corresponding to mainland and island populations

| Demographic history
When deciphering the historical demography of deciduous and evergreen population pairs, the hierarchical model choice procedure strongly rejected models of SI, ancient migration, and of PAN, which were associated with a posterior probability of 0 (Table S3) (Table S3). Across all models, comparisons with heterogeneous gene flow and heterogeneous effective population size were not supported, indicating that, if genetic barriers or linked selection were at play they could not be detected. Demographic parameters were estimated for each pair of populations under the best model (Table S4). Posterior distributions were well differentiated from their prior indicating that estimated parameters were confidently estimated ( Figure S2). Effective population size (N e ) was slightly higher in deciduous than in evergreen habitats (Ne-ABC in Table 2, Figure 3a, Table S4) and tended to be higher on the mainland than on the island. In a number of comparisons, our migration estimates reached the prior upper bounds.
Therefore, we ran a new set of simulations with wider priors (Note S1). Migration rates were not different between deciduous and evergreen habitats: In two instances, the rate of migration was higher from the deciduous to the evergreen, and in two other instances, the reverse was true ( Figure S3).
Deciphering the historical demography between Corsica and the mainland, the ancient migration models (AM) strongly outperformed the SI models, with p(AM) = 0.99 (Table S5) Ne-ABC in Table 2, Table S6). Our analysis also indicated a strong population size change during the process of population isolation,  Table S6).

| Genetic diversity
The mainland populations displayed significantly higher patterns of observed heterozygosity (on average 0.169, Table 2 Figure S5a). LD decayed rapidly in the first 5kb and was lower in populations from the mainland (especially A1D) compared to island populations ( Figure S5b). This pattern of rapid LD decay was similar between chromosomes (see, e.g., chromosomes 1, 2, and Z, Figure S6). Contemporary N e inferred from LD varied from 142 (in A2E) to 355 (in A1D) (Ne-LD in Table 2, Figure S5c).    Figure S10). For the deciduous versus evergreen test, only one of the RDA outliers was also found outlier in Bayescan tests. For the Corsica-mainland test, eight of the RDA outliers were also found outliers in the Bayescan test. Genes in which the outliers were found are reported in the Table S13.

| Variation of genomic differentiation with recombination rate
F ST between mainland and Corsican populations was negatively correlated to recombination rate (Figure 6a, linear model p < 2e-16; Figure S11). In contrast, F ST between each pair of deciduous and evergreen populations was not correlated to recombination rate. F ST outlier SNPs between Corsica and the mainland populations and identified by Bayescan or by the RDA were more often found in regions of low recombination (χ 2 test p-values <.01, Figure 6c) than observed for the entire SNPs (Figure 6b). Average recombination rate was on average twice lower for these F ST outlier SNPs compared to the rest of the SNPs (t test p-values < 1e-6). In contrast, F ST outlier SNPs between deciduous and evergreen populations were not preferentially found in regions of low recombination (Figure 6d).

| Genomic inversions
We detected one putative inversion on chromosome 3, spanning 2.8 Mb, from position 11,838,789 to 14,661,550, and containing F I G U R E 3 Demographic parameters (a) for each of the four pairs of blue tit populations in deciduous and evergreen habitats, estimated using EQ, and (b) for the divergence between populations on the mainland and Corsica, estimated using an ancient migration model. In panel a, circle size is proportional to effective population size (Ne) and arrow width is proportional to migration rate. In panel b, rectangle width is proportional to log10 (effective population size), arrow width is proportional to migration rate, and split time and time of ancient migration are indicated in number of generations

| Gene ontology
None of the gene lists gathered with the different tests (each outlier test among deciduous and evergreen environment and between Corsica and the mainland and the inversion test) yielded any significantly enriched GO term after correction multiple testing (Table   S14). The most promising GO (uncorrected p-value <.0025) in-

| D ISCUSS I ON
In this study, we investigated demographic history and genome-

| Divergence between populations in deciduous versus evergreen habitats
Although we found a significant effect of habitat on genetic structure, the genetic differentiation between neighboring deciduous and evergreen populations was low (F ST ranging from 0.0006 to 0.0079). This result is in line with the primary observations realized earlier on a smaller set of populations (Porlier, Garant, et al., 2012;Szulkin et al., 2016). Accordingly, we found high gene flow from deciduous to evergreen populations. Yet, this quantification of high gene flow and low genetic structure contrasted with the demographic knowledge collected on the Blue tit. Indeed, demographic studies suggested restricted dispersal between these populations, with four dispersal events observed between B4D and B4E (5.6 km apart) and none between the B3E and either B4D or B4E (24.1 km), among a total of 2,788 males, 2,672 females, and 25,158 nestling ringed in the three main Corsican sites between 1976 and 2018, with a mean recruitment rate of 6% (Charmantier, com pers). Our interpretation of this contrast between gene flow estimations gathered from population genomic versus recapture data is quadruple.
First, dispersal estimation on the field using capture-mark-recapture is very challenging and may require more data than currently collected, to detect rare dispersal events, even though these affect population genetic parameters. Moreover, since natal dispersal in the Blue tit classically ranges between 330 m and 4 km (see Ortego, García-Navas, Ferrer, & Sanz, 2011;Tufto, Ringsby, Dhondt, Adriaensen, & Matthysen, 2005), the long-term monitoring sites in Corsica equipped with nest boxes (black circled dots in Figure 1c) are not ideally spaced to identify the origin of immigrants and the destination of emigrants, and only a small fraction of the landscape favorable to blue tit breeding is covered. Second, only a few migrants are sufficient to decrease the genetic distance between populations, measured through the F ST (Cayuela et al., 2018;Marko & Hart, 2011).
In that regard, our results may be compatible with the few dispersal events recorded throughout the years. Third, it is important to note that the number of migrants estimated using demographic analyses represents an average over historical time scales on the order of Ne generations and may have varied widely during contemporary times.
Fourth, the large effective population size found both using coalescence and a LD method might be explained by the existence of large "meta-populations" connected by high gene flow and such large Ne Ne was on average slightly larger in deciduous than in evergreen populations. This could be explained by the higher productivity of deciduous forests compared to evergreen forests, resulting in larger clutches and more fledglings in deciduous habitats (table 1 in Charmantier et al., 2016). However, the very high gene flow between deciduous and evergreen populations and the low genetic differentiation between these populations limits further interpretations. In addition, populations monitored on the long term, for which hundreds of artificial nest boxes have been installed, tended to have larger Ne than nonmonitored populations. Breeding density in the nest box areas studied (Figure 1a) was around 1-1.3 pairs per ha (Blondel et al., 2006), which is most probably 3-5 times higher than natural densities for blue tits when these secondary hole-nesting birds rely on natural cavities only. It is hence possible that the recent We did not find strong footprints of divergent selection between any of the four pairs of deciduous and evergreen populations ( Figure 4). Moreover, the observed outliers were not found repeatedly across the four deciduous-evergreen comparisons but rather each outlier was found only once, and they were not enriched in regions of low recombination. First, this result is in line with the high migration rates observed. High gene flow indeed most likely limits the potential for local adaptation since it limits the accumulation of allelic differentiation, even with high selection coefficients (Lenormand, 2002). Second, this pattern is consistent with the best demographic model being the EQ and not a SC. Indeed, the later would more often create repeated and strong outliers located in regions implicated in reproductive isolation between divergent populations (e.g., Rougemont et al., 2017). Third, this pattern is congruent with a model of local polygenic adaptation involving a transient genetic architecture with multiple alleles of small effects underlying (multiple) quantitative characters (Yeaman, 2015;Yeaman & Whitlock, 2011) and displaying low F ST among loci under divergent selection. Fourth, most documented traits that are involved in the adaptation of blue tits to the deciduous versus evergreen habitat types, such as clutch size or laying date (Blondel, Maistre, Perret, Hurtrez-Boussès, & Lambrechts, 1998;Lambrechts, Blondel, Maistre, & Perret, 1997), are quantitative and most likely rely on a polygenic architecture with alleles of small effects (see, e.g., Santure et al., 2013). However, some statistical issues may limit the interpretation of this result.
First, currently available genome scan methods to detect alleles of small effect are statistically limited, especially when applied to relatively small datasets (Hoban et al., 2016;Rockman, 2012). Second, a potential lack of power in detecting outliers could arise from the combination of the use of RAD sequencing and a rapid LD decay along the genome, that may result in too low resolution especially in regions with larger recombination rates (Lowry et al., 2017). Further analyses of the potential outliers found here, as well as a new analysis with higher marker density and more individuals, are needed to better document and discuss the potential genes and biological processes implicated. Particularly, quantitative genomics will be useful to establish links between the level of divergent selection on these putative targets and their effect size on phenotypic trait variation (Gagnaire & Gaggiotti, 2016;Stinchcombe & Hoekstra, 2007). This would notably reveal whether genes under divergent selection are also responsible for the observed phenotypic variation and contribute to assessing of how much of phenotypic variation is adaptive.

Box 1 Personal (Charles Perrier) reflections on my career and my collaboration with Louis Bernatchez
Last, we searched for inversions potentially associated with phenotypic variation and/or segregating in both habitats. We report multiple evidences ( Figure 6)

| Divergence between blue tit populations in mainland versus insular contexts
Our demographic modeling approach of Corsican and mainland populations revealed that a model with ancient migration was the most probable, which is coherent with the history previously reconstructed for the blue tit complex. Split time of the ancestral populations in two mainland versus island populations connected by gene flow (~3 M years) was relatively coherent with the diversification time found in the literature for the entire blue tit complex (5M [Illera et al., 2011]). Gene flow between the island and the mainland likely stopped around 10,000 years, which is compatible with a gene flow break due to the rise of the flooded stretch between Corsica and the mainland during ice melting and the sea level rise after the last glaciation, during from 17,000 to 5,000 years ago (Jouet et al., 2006;Lambeck & Bard, 2000). Both the signal of population expansion during the diversification period and the gene flow five times larger from the island to the mainland than from the mainland to the island were coherent with a suspected recolonization of the mainland from the islands, as proposed by Illera et al. (2011) based on analyses of nuclear and mitochondrial DNA sequences. The larger effective population size on the mainland than on the island may be due to multiple sources of colonization on the mainland (Taberlet, Meyer, & Bouvet, 1992) and to the much larger mainland area and hence meta-population size compared to the island. Expanding this study with a sampling from the South-East of Europe, one from Sardinia and one from the Iberian Peninsula, could help determine whether the large size estimated for the mainland meta-population studied here might be explained by a mixed recolonization from these refugium (Kvist, Ruokonen, Lumme, & Orell, 1999;Kvist, Viiri, Dias, Rytkönen, & Orell, 2004) and whether insular populations consistently have reduced effective population size. It would also be interesting to explore a model allowing for multiple cycles of isolations and contacts that could have resulted from successive glacial cycles (Hewitt, 2004).
Given the demographic parameters inferred, it is not surprising that the genetic differentiation between populations from the mainland and the island was moderate (F ST = 0.08). However, this differentiation value contrasted with the much larger divergence at mitochondrial DNA observed for these populations (OST = 0.67 between A1D and B4D (Kvist et al., 2004)). This discrepancy between whole nuclear genome and mitochondrial estimates of differentiation could first be explained by the typically four times smaller Ne in mitochondrial DNA compared to nuclear DNA in diploid organisms, resulting in stronger drift and divergence (Smith & Klicka, 2013).
Second, accumulation of cytonuclear incompatibilities could limit its introgression (Burton & Barreto, 2012). Alternatively, gene flow may have been greater for males than females during the colonization process from Corsica to the mainland, at the end of the last glacial period. However, this hypothesis is not supported by the general observation that females disperse on average further away than males and that rare long-distance dispersals are also achieved by females (Tufto et al., 2005). Inspecting mitochondrial DNA variation for the birds studied here would be very useful to compare mitochondrial differentiation in our sample to the previous analysis which also included Corsican birds (Kvist et al., 2004). In any case, one should bear in mind that our results stem from integrating coalescent patterns observed across thousands of loci, therefore providing increased resolution to investigate the determinants of demographic divergence compared to approaches based on fewer mitochondrial and nuclear loci.  (Burri et al., 2015;Spurgin et al., 2019). It is well documented that such increased differentiation in regions with low recombination is not necessarily due to positive selection, or at least not alone, and that it is largely influenced by the effect of recombination in interaction with background selection (Burri et al., 2015;Charlesworth et al., 1993;Perrier & Charmantier, 2019). Moreover, a uniform F ST sliding window, sized in Kb, is expected to dilute signatures of selection in regions of the genome where LD is fast decaying while it is expected to over-represent regions where LD decays much more slowly, hence increasing even more the potential false positives in regions with low recombination (Beissinger, Rosa, Kaeppler, Gianola, & De Leon, 2015;Perrier & Charmantier, 2019). Although perfectible, our sliding window approach using a SNP unit attempted to fix this issue and successfully captured several new outlier regions outside of deserts of recombination. This second window approach should however be improved, for example, by estimating the local neutral enveloped, which computation should integrate local variations in LD, recombination, diversity, but also the demography of populations. As mentioned for the study of adaptive divergence between deciduous and evergreen populations, complementary analyses integrating both genome scans and quantitative genomics would improve our comprehension of the genomic and phenotypic divergence observed between Corsican and mainland blue tits (Gagnaire & Gaggiotti, 2016;Stinchcombe & Hoekstra, 2007).
While the putative genomic inversion on chromosome 3 was absent from Corsica and detected in mainland individuals, its level of divergence from the noninverted sequence indicated that it was likely twice older than the beginning of divergence between blue tit populations from mainland and Corsica. This could first suggest that this polymorphism emerged in mainland blue tit populations and then did not introgress the Corsican populations, maybe due to a local disadvantage and/or genetic incompatibilities or simply due to drift coupled to little gene flow. A second hypothesis could be that this inversion was present in Corsican populations but had been purged out due to a local disadvantage. Such a disadvantage can be due to the typical accumulation of deleterious mutations in such nonrecombining inverted sequences (e.g., Jay et al., 2019).
Lastly, it is possible that this inversion has been acquired in mainland populations only recently, after the last period of contact between mainland and Corsican populations, via gene flow from a distinct refugium with which connectivity could have been enhanced after deglaciation, hence after the stop of gene flow with Corsica. The age of this inversion, its origin, its biological effects, and the potential accumulation of deleterious mutations need to be inferred more thoroughly via genotyping full inversion sequences from individuals in diverse locations and using advanced statistical methods (Lohse, Clarke, Ritchie, & Etges, 2015).

| CON CLUS I ON AND PER S PEC TIVE S
Our study demonstrated the usefulness of demographic modeling and of the analysis of the variation of genomic diversity and recombination along the genome to uncover the genetic determinants of local adaptation in a small passerine with a large distribution, and occupying different forest habitats. Especially, demographic modeling rejected the hypothesis of a secondary contact between deciduous and evergreen populations and favored a situation with continuous gene flow. These results support the idea that blue tits have adapted to their habitats despite ongoing gene flow, while contextualizing how large gene flow most probably constrained local adaptation (Lenormand, 2002) and favored its architecture based on alleles of small effects (Yeaman, 2015). The genomic modeling also refined our knowledge about the divergence between insular and mainland meta-populations, that have been likely unconnected by gene flow for the last ten thousand years. We also verified the relationship between local recombination rate and differentiation, that should probably be integrated in genome scans looking for footprints of selection (Beissinger et al., 2015;Berner & Roesti, 2017;Booker et al., 2020;Burri et al., 2015;Perrier & Charmantier, 2019).
Future investigations will require increased sample sizes and marker density (Hoban et al., 2016;Lotterhos & Whitlock, 2015) in order to better detect loci with small effects that contribute to the quantitative phenotypic variation and local adaptation in blue tits. Lastly, the putative inversion found here would need further analyses since this type of structural variation is often implicated in phenotypic variation (Kirkpatrick, 2006(Kirkpatrick, , 2010Stapley et al., 2017;Wellenreuther et al., 2019).

ACK N OWLED G EM ENTS
We wish to thank Louis Bernatchez for inviting our contribution in this special issue (Box 1). We wish to thank the many people who helped on the blue tit long-term project. We particularly wish to thank Philippe Perret. We also wish to thank Samuel Perret, Christophe De Franceschi, Jacques Blondel, Marta Szulkin, Monica Arias, Patricia Sourouille, Annick Lucas & Boris Delahaie. We thank the associate editor, two reviewers, and the production team for their constructive comments and edits on previous versions of this manuscript.
This project was funded by the European Research Council (starting grant ERC-2013-StG-337365-SHE to AC) and the OSU-OREME.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R S' CO NTR I B UTI O N S
CP conceived the study, conducted fieldwork, laboratory analyses, bioinformatics and statistical analyses, interpreted the results and wrote the manuscript. QR conducted bioinformatics and statistical analyses, interpreted the results and edited the manuscript. AC obtained the ERC grant, conceived the study, performed fieldwork, interpreted the results and edited the manuscript.