Genomic basis of adaptation to climate and parasite prevalence and the importance of odorant perception in the ant Temnothorax longispinosus

A co‐evolutionary arms race ensues when parasites exhibit exploitative behaviour, which prompts adaptations in their hosts, in turn triggering counter‐adaptations by the parasites. To unravel the genomic basis of this coevolution from the host's perspective, we collected ants of the host species Temnothorax longispinosus, parasitized by the social parasite Temnothorax americanus, from 10 populations in the northeastern United States exhibiting varying levels of parasite prevalence and living under different climatic conditions. We conducted a genome‐wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) associated with both prevalence and climate. Our investigation highlighted a multitude of candidate SNPs associated with parasite prevalence, particularly in genes responsible for sensory perception of smell including odorant receptor genes. We further focused on population‐specific compositions of cuticular hydrocarbons, a complex trait important for signalling, communication and protection against desiccation. The relative abundances of n‐alkanes were correlated with climate, while there was only a trend between parasite prevalence and the relative abundances of known recognition cues. Furthermore, we identified candidate genes likely involved in the synthesis and recognition of specific hydrocarbons. In addition, we analysed the population‐level gene expression in the antennae, the primary organ for odorant reception, and established a strong correlation with parasite prevalence. Our comprehensive study highlights the intricate genomic patterns forged by the interplay of diverse selection factors and how these are manifested in the expression of various phenotypes.


| INTRODUC TI ON
Parasites often exert significant selection pressure on their hosts, leading to the evolution of counter-adaptations in the host aimed at minimizing the parasites' impact.This dynamic fuels an arms race, resulting in reciprocal co-adaptations in both interaction partners (Dawkins & Krebs, 1979;Tellier et al., 2014).The stage for this coevolutionary arms race is set by the prevailing environmental conditions, which vary in time and space, playing a crucial role in shaping the evolutionary trajectories of sympatric host and parasite populations (Clarke & Fraser, 2004;Gregory, 2009;Oppold et al., 2016;Wisz et al., 2013).Adaptive traits can thus evolve in response to biotic and abiotic selective forces in parallel (Briscoe et al., 2020), and may even favour opposite trait expressions, similar to antagonistic pleiotropy (Garland et al., 2022).With resulting physiological, morphological and/or behavioural adaptations induced by these interacting forces, any analysis that seeks to understand the evolutionary forces acting on a host species through a parasite needs to consider abiotic and other biotic factors alike.
A prime example of a fundamental trait serving both abiotic and biotic factors is the cuticular hydrocarbons (CHCs) of insects.These CHCs primarily safeguard the body of arthropods against desiccation (Berson et al., 2019;Blomquist & Bagnères, 2010;Edney, 2019), with an additional role in communication.Here, they can, for instance, signal the fertility status (Jallon, 1984;Venard & Jallon, 1980) and, in social insect societies, regulate the hierarchy and division of labour (d'Ettorre & Heinze, 2005;Honorio et al., 2019;Leonhardt et al., 2016;Monnin, 1999).The precise function of all CHC compounds remains largely unknown, and we are only in the early stages of comprehending their specific or multiple roles.Two important classes of CHCs are the linear n-alkanes, which tightly layer on insect cuticles as protection against desiccation, and the methyl-branched n-alkanes, mainly serving as recognition cues in the discrimination of nestmates and intruders (Dani et al., 2001;Lorenzi et al., 2011;van Zweden & d'Ettorre, 2010).This dual function of CHCs introduces an intriguing conflict since the compounds required for protection against desiccation may be at odds with those serving in communication, as methyl-branching decreases desiccation protection capacities (Gibbs, 1998(Gibbs, , 2002;;Gibbs & Pomonis, 1995).Depending on the position, adding a methyl-branch to the carbon chain can significantly reduce the melting temperature (Gibbs & Pomonis, 1995), which renders these communication signals more volatile in warmer climates.Consequently, environmental parameters such as temperature were shown to influence the composition of CHC profiles in social insects, indicative of climatic adaptation (Martin & Drijfhout, 2009;Menzel et al., 2018;Sprenger et al., 2019;Wagner et al., 1998Wagner et al., , 2001)).Simultaneously, the organizational structures of social insects hinge on their ability to produce and recognize these nuanced cues, which act like chemical key cards, allowing individuals to access their native colony (Krasnec & Breed, 2013;Martin et al., 2008).Their diverse functions may result in a tug of war for these two classes of CHCs to find an optimum or balance to properly serve both functions sufficiently.In the context of social parasitism, cuticular hydrocarbons play an intriguing role, since social insect parasites break into the fortress of insect societies by overcoming said chemical barrier (Lenoir et al., 2001).During the course of their evolution, these parasites developed and employed various strategies to subvert their hosts' chemical recognition processes and disrupt host defences, for instance, by using glandular secretions as a type of chemical warfare (Allies et al., 1986;Bauer et al., 2009) or by reducing their chemical profile in a strategy known as 'chemical insignificance' (Lenoir et al., 2001).The ultimate goal of the parasite is to coerce host workers into brood care of parasitic offspring and colony maintenance.
The small, acorn-dwelling ant Temnothorax longispinosus is the main host of the dulotic or 'slavemaking' parasite T. americanus (Herbers & Foitzik, 2002), which relies on the host work force to perform virtually all colony tasks (Wesson, 1939).The geographical distribution of the host extends from Quebec, Canada, in the north to Mississippi, USA, in the south and from Nova Scotia in eastern Canada to Minnesota, USA, in the west (Mackay, 2000), and consequently covers a diverse range of climates.Its parasite coinhabits these areas, albeit in higher densities in warmer regions compared to the host (Jongepier et al., 2014(Jongepier et al., , 2015;;Kaur et al., 2019).Parasite prevalence, that is, the relative abundance of social parasites to host colonies, reflects the parasite pressure on the host and, in this system, is higher in the southwestern populations.Along their widespread habitats, the differences in abiotic conditions such as climate could be reflected in various influences on the evolutionary trajectories of the local host populations and consequently the chemical trait expression of CHCs (Blomquist & Bagnères, 2010).
The chemical defences of T. longispinosus against 'enslavement', chemical manipulation and consequent infiltration strategies of the social parasite have been intensively studied.For example, the parasite's ability to be chemically insignificant is based on the loss of a number of methyl-branched n-alkanes (Kleeberg et al., 2017;Lenoir et al., 2001).In fact, parasites from areas with high prevalence carry even lower concentrations of these recognition cues and are less likely to be attacked by their hosts (Kaur et al., 2019).Furthermore, the CHC profile of T. americanus seems to track that of its host populations (Brandt et al., 2005), likely leading to more successful raids and keeping the aggression of exploited host workers towards parasite pupae in check (Achenbach et al., 2010).
As coevolutionary arms races can trigger host diversification (Hamilton, 1982), one of the reported host counter-adaptations is indeed the diversification of chemical profiles in host populations from areas where the parasite is present (Jongepier & Foitzik, 2016;but Kleeberg et al., 2017).With the resulting unique chemical composition, it becomes impossible for the parasite to adapt chemically to all local hosts.Another chemical weapon used by the parasite is the secretion of their Dufour's gland, manipulating its hosts during raids resulting in defenders attacking nestmates rather than intruders (Brandt & Foitzik, 2004).Host populations under strong parasite pressure have thus counterevolved a lower perception of these chemical manipulation tactics (Jongepier et al., 2015), and at the same time often increase their levels of aggression after a previous encounter with a social parasite (Pamminger et al., 2011;Scharf et al., 2011) as an a posteriori defence for future attacks.'Flight' can also be a successful behavioural strategy in response to social parasites, and the 'fight versus flight' defence portfolio has been observed to shift with parasite prevalence in the field (Hermann, 1984;Hölldobler & Wilson, 1990;Jongepier et al., 2014).Similarly, in areas with low to high parasite prevalence, there is a shift from mono-to polygyny, and an increase in colony size (Foitzik et al., 2009;Foitzik & Herbers, 2001), reducing preferability to the parasite.Although these chemical, behavioural and social-organizational shifts with parasite prevalence are well documented, there are no studies yet to show the extent to which they have a genetic basis or are in fact largely plastic.
In this study, we aimed to identify signals of selection in the genome of the host ant T. longispinosus by two selective forces -climate and parasite prevalence -using a whole-genome sequencing approach on a population level.We further investigated potential links between the population-level expression of two phenotypes -the chemical profile involved in host signalling, and antennal gene expression as an indicator of perception capabilities -to the genome and the environment.We collected 100 independent T. longispinosus colonies each from 10 different communities in the northeastern USA, including previously studied and new sites, varying in their parasite prevalence and local climate.Our approach included (i) a genome-wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) associated with either local parasite prevalence or climate, (ii) an investigation of population-specific differences in chemical profiles, (iii) a population-level antennal gene expression study to investigate whether transcriptional activity shifts along the parasite prevalence gradient and (iv) a GWAS to identify loci associated with the abundance of specific CHC compounds.Since the two analysed phenotypes were shown to have a significant level of plasticity depending on various (a-)biotic factors (Feldmeyer et al., 2016;Menzel et al., 2018;Segev & Foitzik, 2019), all experimental colonies were kept under controlled standard laboratory conditions to generate a phenotype expression free from environment-induced variability.Our goal was to uncover the genomic footprints associated with adaptations observed and described in prior studies, and put forward potential candidate genes for exploration in future studies.

| Ant collection and maintenance
Our focal species, Temnothorax longispinosus, has black, small, 2-to 3-mm-long workers in facultatively polygynous colonies of several dozen to hundred workers.It is widespread in deciduous forests in the northeastern USA and southeastern Canada.The colonies inhabit crevices, sticks, acorns or hickory nuts on the forest floor (Foitzik et al., 2009).Host colonies were collected from July to September 2021 across 10 different locales in the northeastern USA, with minimum distances of 100-200 km between collection sites (Figure 1a, Tables S1 and S2).The collection sites were chosen based on previous sampling efforts to ensure long-term information on parasite prevalence and host colony traits (Kaur et al., 2019), with additional sites being selected based on geographical location, forest composition and focal tree densities, namely oak and hickory (U.S. Forest Service, 2021).Per site, collection was further divided into several sub-sites within a 5 × 5 km grid (2-10 subsites per population; Table S2) to capture local genetic diversity.Colonies were mainly collected in hollow acorns, less frequently in hickory nuts or rock crevices and then moved into plastic bags, fed and stored at 4°C before being transported to Mainz, Germany.There, they were placed in three-chambered nest boxes with an artificial nest site consisting of acrylic glass (3 mm high) with an entrance embedded between two microscope slides (7.5 × 2.5 × 0.5 cm), darkened with red foil.As both, transcriptional activity and the cuticular hydrocarbon profile, can be highly dependent on environmental factors in ants (Feldmeyer et al., 2016;Menzel et al., 2018;Segev & Foitzik, 2019), natural confounding factors were reduced by standardizing temperature to 18°C, controlling humidity (80%) and light cycles (12 h:12 h dark:light) for all colonies.Ants were fed once per week, following the 'Bhatkar diet' (Bhatkar & Whitcomb, 1970) with water given ad libitum, until dissections in mid-November 2021.

| Estimating parasite prevalence and climate
For a strong estimation of parasite prevalence, that is the percentage of parasitic colonies in the whole Temnothorax community (consisting of T. longispinosus and T. americanus), we chose previously studied sampling sites (Brandt & Foitzik, 2004;Foitzik et al., 2009;Herbers & Foitzik, 2002;Jongepier et al., 2014) ensuring stable longterm data since social parasites are irregularly distributed (Foitzik & Herbers, 2001;Herbers & Foitzik, 2002).Temnothorax ants have a long generation time of 5-15 years, and their large population sizes suggest relatively stable parasite prevalence over time.Since we focused on T. longispinosus colonies and the impact of local parasite pressure, we excluded parasitized colonies with only T. curvispinosus host workers (co-occurring with the focal host species in Massachusetts, New York South, Ohio and West Virginia), a less preferred host species (Brandt & Foitzik, 2004) from our calculations.
Mixed colonies with both exploited T. longispinosus and T. curvispinosus workers were weighted based on the ratio of workers from each host species (for instance, a 1:1 ratio of both hosts in a parasitic nest is counted as 0.5 instead of 1).
To investigate local adaptation to the climate, we retrieved climate data from the CHELSA bioclim database (1981-2010, v.2.1;Data S1 'Climate_data'), which consisted of 10 temperature and 8 precipitation values.PCA plots were created using fviz_pca_biplot() from the FactoMineR package (Lê et al., 2008) in R v.4.2.2 (R Core Team 2023).Using a probabilistic broken stick model, Principal Component 1 (PC1) was identified to explain more variances than expected by chance.Thus, PC1 climate eigenvalues were used for our genome-environment association analysis.

| Sample preparation and DNA/RNA/CHC extractions
To investigate local adaptation on population level, we took two workers from 90 to 100 independent, predominantly single-queen colonies per population (100 in all populations except in New Hampshire with 90, Maryland with 99 and West Virginia with 97 colonies).From each colony, the first two workers that ran to the entrance upon nest disturbance were removed and anaesthetized on ice.This procedure allowed to collect workers with similar behaviour and transcriptional activity in a standardized way (Kohlmeier et al., 2017(Kohlmeier et al., , 2018(Kohlmeier et al., , 2019)).From one ant, two legs were dissected for the population genomics analysis (PoolSeq) and transferred to an empty tube, and both antennae were dissected for a pooled gene

| PoolSeq analysis
Whole-genome sequencing data were trimmed using Trimmomatic v.0.39 (Bolger et al., 2014) with a 10 bp head crop and quality-checked with FastQC (Andrews, 2010).Overlapping reads in pooled wholegenome sequencing data were identified using pear v.0.9.11 (Zhang et al., 2014) and separately mapped and later-on merged to the nonoverlapping reads.Mapping of trimmed reads was performed using BWA mem (Li & Durbin, 2009) with a minimum seed length (k) of 30 and using an unpublished reference genome of T. longispinosus (The GAGA Consortium, 2017), which resulted in a mean mapping rate of 95.6% (±1.0%).Samtools v. 1.10 (Li et al., 2009) flagstats was used to determine mapping rate and average coverage.We used Picard v.2.20.8 (Broad Institute, 2018) markDuplicates to mark and remove duplicate sequences.We then sorted and converted to bam using samtools and removed mappings with low quality using samtools view (-q 20 -f 0x0002 -F 0x0004 -F 0x0008).Files were combined using samtools mpileup to a single file and then converted to a sync file using the mpileup2sync.jarscript provided by PoPoolation2 v.
1.201 (Kofler, Pandey, et al., 2011).We determined linkage disequilibrium using LDx (Feder et al., 2012), which showed convergence at around 300-400 bp at an R 2 of .3 in all populations, meaning that SNPs within this range ought to be linked (Figure S1a).Thus, for further analyses, a window and step size of 1 kB were used.We created individual pileup files using samtools mpileup, and identified and removed indels using PoPoolation1 (Kofler, Orozco-terWengel, et al., 2011) identifygenomic-indel-regions.pl and filterpileupby-gtf.plrespectively.SNPs with quality <20 were filtered using subsample-pileup.pl.To investigate genetic diversity within a population, we used variance-sliding.pland calculated Tajima's π for nucleotide diversity and Watterson's estimator θ for genetic diversity with window and step size of 1 kB.Plots were generated using the boxplot() function in R. To investigate interpopulational differences, we piled up all populations and the reference files again using samtools mpileup, converted to a sync file and identified and removed indels as previously described.We estimated population divergence as F ST , and their corresponding FET values using fst-sliding.pland fisher-test.plfrom PoPoolation2 respectively.
We compared allele frequencies between populations with extreme differences in conditions using the Cochran-Mantel-Haenszel (CMH) test statistics, as we did not expect linearity in loci associated with either of our factors and rather assumed a multigenic trait expression.We conducted CMH tests between the two highest-ranking populations (West Virginia and Pennsylvania) with the lowest-ranking ones (Maine and New Hampshire) in regard to parasite prevalence and irrespective of other factors like local climate (Ohio was excluded due to its unique patterns in allele frequencies; Figure S2a).We used PoPoolation2's cmh-test.pland cmh2gwas.pl,setting the significance level at p ≤ .0001.To account for putative biases from the choice of pairs, analysis was performed twice with reciprocal pairs.With 92% overlaps in SNPs in both runs, non-overlapping SNPs were excluded from further analyses.
For PC1 climate eigenvalues, we selected population pairs with the greatest pairwise climate differences, comparing Massachusetts versus Maine, and Vermont versus New York South, with a 95% overlap in both reciprocal runs.Our findings revealed a strong correlation between PC1 climatic eigenvalues (henceforth PC1 climate), explaining 42.1% of the variance and parasite prevalence (Pearson's cor.: r = −.78,p = .008;Figure S2b).We, therefore, decided to overlap the outliers associated with parasite prevalence and climate to obtain parasite prevalence-exclusive SNPs, CMH para , and climateexclusive SNPs, CMH clim , while discarding overlapping SNPs associated with both conditions.Significant SNPs were categorized as within genes, exons, introns, 2 kB around of genes or outside of genes using bedtools intersect (Quinlan & Hall, 2010).SNPs resulting in amino acid changes were identified using CROXA/tbg-tools v.2.0 (Schönnenbeck et al., 2021) and individually investigated using igvtools v.11.0.2.(Robinson et al., 2011).

| PoolRNASeq analysis
Trimming and quality checks of PoolRNASeq reads were performed as described above.Mapping of reads was done using HISAT2 v.2.1.0(Kim et al., 2015) with the aforementioned reference genome, which resulted in a mean mapping rate of 93.7% (±1.1%).Using the htseqcount function from HTSeq v.2.0.2 (Putri et al., 2022), we created a transcript read count table, which was used as input for DESeq2 (Love et al., 2014).Genes with less than 10 counts in at least five samples were removed.We tested for differentially antennal expressed genes (AEGs) using parasite prevalence as a continuous variable.Transcripts were variance stabilized before performing a PCA, for which the broken stick model showed PC1 to explain more variances than expected by chance.

| Extraction of chemical profile
The CHC extracts were analysed using a gas chromatograph (7890A, Agilent Technologies, Santa Clara, CA, USA) equipped with a Zebron Inferno ZB5-HT column (Phenomenex Ltd., Aschaffenburg, Germany).Temperature was kept at 60°C for 2 min, then increased by 60°C/min up to 200°C and subsequently by 48°C/min to a maximum of 320°C and was kept constant for 10 min.Helium was used as carrier gas at a flow rate of 1.2 mL/min.Hydrocarbon molecules were then transferred to a mass selective detector (5975 C, Agilent Technologies) and fragmented with an ionization voltage of 70 eV.Fragments in the range of 40-550 m/z were detected and subsequently used for substance identification based on the retention times and diagnostic ions (Carlson et al., 1998).We quantified substances using a total ion count from the chromatogram peaks that we manually integrated using MSD ChemStation (E.02.02.1431,Agilent Technologies).Relative proportions were calculated in Microsoft Excel from which we filtered out (i) non-hydrocarbon substances, (ii) substances at the end of the profile that could not be reliably integrated and (iii) substances from which the maximum value for a sample stood below 0.5% of the total profile.The overall profiles were then adjusted to 100%.We furthermore identified substances based on the ion mass spectrum from each peak, and classified them as nalkanes or methyl-branched recognition cues previously identified (Jongepier & Foitzik, 2016).A non-metric multidimensional scaling was used on relative abundances to establish population coordinates for all identified compounds using the metaMDS() function from the vegan package (Oksanen, 2022) in R.

| CHC association study
We performed a 'phenotype' genome-wide association study (GWAS) using the BayPass v.2.2 (Gautier, 2015) standard covariant model, for which we assumed linearity in CHC trait expression and aimed to determine SNPs in genes for recognition cues known to be associated with aggression from a previous study on this species (Jongepier & Foitzik, 2016).The goal was to identify genes that might play a role in the synthesis of those CHCs, for which we have experimental evidence that they are used in recognition and aggressive behaviour.For this, we uniformed coverage to a minimum of 20 using subsample-synchronized.pl and then calculated major and minor allele frequencies by using snpfrequency-diff.plfrom the PoPoolation2 toolkit.The output was transformed to BayPass format for pooled samples.SNPs with a Bayes Factor ≥20 were deemed significant.Similar to in the CMH analysis, significant SNPs associated with any recognition cue, henceforth BayPass CHC , were categorized as within genes, exons, introns, 2 kB around of genes and outside of genes.SNPs resulting in amino acid changes were identified using CROXA/tbg-tools (Schönnenbeck et al., 2021).

| Annotation and GO enrichment analysis
For all available genes found in the T. longispinosus genome, genes were annotated by running a BlastP (Altschul et al., 1990) search against the non-redundant nucleotide invertebrate database (Jan 2023).To further characterize gene functions, we ran InterPro (Paysan-Lafosse et al., 2023) to retrieve Gene Ontology (GO) information, which was used to perform an enrichment analysis using topGO (Alexa & Rahnenfuhrer, 2017) for genes that contain CMH para , CMH clim and BayPass CHC SNPs, as well as AEGs.For candidate genes, we searched for similarly named genes/proteins in UniprotKB (The UniProt Consortium, 2023) for model organisms such as Drosophila melanogaster to obtain their putative function.
Candidate genes were characterized as genes containing nonsynonymous SNPs identified in the GWAS, and/or had overlaps with candidate genes from previous studies.

| Population structure and link between environmental traits
Intrapopulation nucleotide (Tajima's π) and genetic diversity (Watterson's θ; min = 0.0, max = 0.06, mean = 0.003 for both parameters) were generally low (Figure S1b,c, respectively).Allele frequency differences F ST were low among all populations (min = 0.03, max = 0.049, mean = 0.04; Figure S1d).However, a weak trend of isolation by distance was detected (Mantel-test: r = .264,p = .064;Figure 1c).Similarly, a principal component analysis (PCA) based on allele frequencies at all SNP loci indicated PC2, explaining 12.3% of variations, to be linked to the geographical distances between populations (Mantel test: r = .30,p = .049;not depicted).Furthermore, PC1, explaining 14.2% of observed variation, showed the population of Ohio to be an outlier in comparison to the other populations (Figure S2a).
Our extensive sampling of 10 populations across the species range revealed strong differences in parasite prevalence ranging from 2.3% to 15.6% (Figure 1a; Table S1).Moreover, we detected a strong correlation between parasite prevalence and PC1 of climate data (Pearson's cor.: r = −.78,p = .008;Figure S2b; see Section 3.3), which coincides with the social parasite T. americanus occurring at higher relative frequencies in warmer climates.This interaction between parasite prevalence and climate makes it difficult to identify SNPs in the host that are associated with either of the two intertwined environmental parameters.By selecting and contrasting populations in two separate analyses that differed more strongly in either parasite prevalence or climate, and discarding their overlaps, we tried to separate the two factors as much as possible.

| Parasite prevalence associated SNPs and functional properties of candidate genes
Using CMH test statistics, we contrasted allele frequencies of two populations with the highest and the lowest parasite prevalence (West Virginia and Pennsylvania vs. Maine and New Hampshire, excluding Ohio due to its unique allele frequency pattern; Figure S2a).This resulted in 1614 SNPs significantly associated with parasite prevalence (p ≤ .0001),henceforth CMH para SNPs, located in 1076 different 1 kB windows (Figure S3a).In total, 754 SNPs were located within introns (46.7%), 90 SNPs were located within exons (5.6%), 208 within 2 kB around a gene (12.9%) and the remaining 562 SNPs outside of genes (34.8%).In summary, 52.3% of all CMH para SNPs were within 288 unique genes, of which 263 were annotated (Data S2 'CMHpara'; Figure 2).
A GO-enrichment analysis for all CMH para candidate genes revealed biological functions such as 'peptidoglycan catabolic process' and 'sensory perception of smell' as two of the eight significantly enriched GO terms (p ≤ .05; Figure 2; Data S7 'topGO_analysis').
For further analyses, we focused on CMH para SNPs within exon regions that resulted in a non-synonymous substitution, with a special emphasis on those leading to changes in biochemical properties of the translated amino acid, also known as a radical substitution (e.g.amino acid with negatively charged side chain changes to one with a hydrophobic side chain).This resulted in 15 candidate genes of which eight contained such radical substitutions (Table 1).Of these, three could be identified as odorant receptor gene 43a-, 43a-like and a nonannotated one, as well as gustatory receptor gene 28a.Furthermore, the gene probable cytochrome P450 6a20 contained an SNP resulting in a nonsense mutation.

| Climate-associated SNPs and functional properties of candidate genes
A PCA using climate variables retrieved from the CHELSA bioclim database revealed PC1 to explain 42.1% of observed variance in temperature-and precipitation-related values, while PC2 explained 23% of the observed variance (Figure 1b; Data S1 'Climate_data').
The climatic landscape varied with geographical distance between populations (Mantel test: r = .36,p = .03;not depicted; raw data Figure S1d).Within the investigated range, host populations in the southwest are exposed to warmer and drier climates (e.g.Ohio and West Virginia; equivalent to lower PC1 climate eigenvalues) versus colder and more humid climates in the northeast (e.g.Maine or New Hampshire; equivalent to higher PC1 climate eigenvalues; Figure 1a, indicated by yellow-to-purple gradient).
Due to the aforementioned correlation between climate and parasite prevalence, we contrasted pairs of populations with the highest difference in their PC1 climate eigenvalues as a proxy for the largest difference in climatic conditions (ergo Maine and Vermont vs. Massachusetts and New York South, excluding Ohio), while avoiding contrasting the same pairs used in the CMH para analysis.This resulted in 3024 climate-associated SNPs, henceforth called CMH clim SNPs, located in 1998 different 1 kB (Figure S3b).Of these, 133 SNPs were located within exons (4.4%), 1492 within introns (49.3%), 287 within 2 kB around a gene (9.5%) and the remaining 1114 SNPs outside of genes (36.8%).In summary, 53.7% of all CMH clim SNPs were located within 421 unique genes, of which 407 were annotated (Data S3 'CMHclim'; Figure 2).Genes with the greatest number of SNPs within exons were in protein mesh isoform X1 with a total of 10 SNPs within exon region and 25 within intron region.

| Population-level chemical profiles and genome-wide associations of recognition cues
We identified 54 cuticular hydrocarbons (CHCs), of which 9 were previously identified as important recognition cues in T. longispinosus (Jongepier & Foitzik, 2016), and 6 as linear n-alkanes important for protection against desiccation in our pools of whole individuals (Data S6 'CHC_profiles').We used a non-metric multidimensional scaling of all 54 CHCs to visualize population-level differences (Figure S4a) and established a positive correlation of MDS1 to PC1 climate (Pearson's cor.: r = .76,p = .01;Figure S4b).
We split the dataset into two CHC classes, the six identified linear n-alkanes, henceforth n-alkanes, and the nine focal recognition cues.We detected a positive correlation between the relative abundances of n-alkanes and PC1 climate, with relatively more n-alkanes found in the CHC profiles in colder, more humid habitats (Pearson's cor.: r = .78,p = .008;Figure 3a), while we detected a negative correlation with parasite prevalence (Pearson's cor.: r = −.66,p = .039;Figure S5a).We identified non-significant trends of the relative abundances of all recognition cues increasing with parasite prevalence (Pearson's cor.: r = .58,p = .08;Figure 3b) and decreasing with PC1 climate (Pearson's cor.: r = −.57,p = .088;Figure S5b).We further tested for correlations between each individual recognition cue against parasite prevalence but found no significant correlation after FDR correction.Instead, we identified recognition cue 11-15-DiMeC31 to be significantly correlated with climate (Pearson's cor.: r = −.91,p = .002,Figure S6).
Four of the genes containing associated SNPs contain nonsynonymous substitutions, with three of the genes being annotated TA B L E 1 Summary of candidate genes containing non-synonymous SNPs associated with parasite prevalence identified by a GWAS using Cochran-Mantel-Haenszel test statistics, contrasting major allele frequencies of two populations with highest (PA and WV) versus the lowest (ME and NH; Table S1) parasite prevalence.Additionally, the average percentage of occurrence of the alternate variant is given for both high and low parasite prevalence populations in comparison to the reference genome.b Radical substitution (change in amino acid biochemical properties).

| Gene expression patterns of antennae pools across populations
A PCA of pooled antennal transcriptome data revealed that PC1 explained 61% of observed variation (Figure S7a), and is negatively correlated with parasite prevalence (Pearson's cor.: r = −.85,p = .002; Figure 4a), and weakly, but non-significantly correlated with climate (Pearson's cor.: r = .63,p = .051;Figure S7b).A DESeq2 analysis of pooled antennal transcript data using parasite prevalence as a continuous variable revealed 2569 significantly differentially expressed transcripts (henceforth referred to as antennal expressed genes (AEGs), p adjust ≤ .05;Data S5 'AEGs').Among the top10 log-2 foldChange (log 2 FC) of significant AEGs, we identified two genes directly (GSK-3-binding protein and histamine H2 receptor) or indirectly (PHD finger protein rhinoceros) involved in post-translational modifications (PTM).Furthermore, we found 33 genes involved in The main objective of this study was to assess genomic footprints of selection on the host species T. longispinosus potentially exerted by the social parasite T. americanus and the local environmental conditions using climate as a proxy.Our approach involved a genome-wide association study (GWAS) with pools of host ants sampled from 100 independent colonies at 10 different sites in the northeastern USA.
These 10 sites were characterized by diverse climatic conditions and different levels of parasite prevalence, that is, the percentage of parasitized colonies in a population, and both were considered crucial environmental factors.We also investigated the interplay of two phenotypic traits critically related to these factors, (i) the cuticular hydrocarbon profile as an indicator of signalling and (ii) genes expressed in the antennae as an indicator of cue recognition.We discovered an as-of-yet unknown correlation between local climate and parasite prevalence, further highlighting the complexity and probable limitations of mechanisms driving host-parasite co-evolution in regard to their shared environment.We demonstrate that both ecological parameters were not only associated with genomic signatures of selection but also linked to the expression of both chemical signalling and cue recognition traits.

| Weak population structure indicating gene flow
Despite significant differences in local climate and the extensive sampling area, our study indicates weak population differentiation in T. longispinosus.This aligns with a previous microsatellite study, reporting similarly low differentiation in allele frequencies among 14 populations, half of which half overlapped with populations from this study (Pennings et al., 2011).The same study also found no patterns of isolation by distance, while we identified a non-significant trend (Figure 1c).Low population differentiation has also been reported in other ant species such as Formica fusca, Crematogaster levior, Cardiocondyla obscurior and Temnothorax nigriceps, attributed to their high dispersal rates (Cordonnier et al., 2022;Errbii et al., 2021;Hartke et al., 2021;Johansson et al., 2018).Similarly, the weak population structure in T. longispinosus may be explained by a recent expansion and/or high dispersal rates in this species.
The large population size of T. longispinosus (Brandt & Foitzik, 2004;Herbers & Foitzik, 2002;Pennings et al., 2011) prevents genetic drift, and both males and queens conduct large-scale nuptial flights with males in particular contributing strongly to gene flow (Pennings et al., 2011), thus we suggest high gene flow among populations as the main factor driving the observed low population differentiation.

| Genomic associations to parasite prevalence and climate
Climate parameters used for this study were more stable and accurately measured in comparison to parasite prevalence, which was estimated on the basis of single or multiple collections per population over several years.In this study, we identified a close correlation between both factors, local climate and parasite prevalence, for which we hypothesize that their interplay ultimately leads to the geographical variation in phenotypic trait expression, as already shown (Jongepier et al., 2014(Jongepier et al., , 2015;;Jongepier & Foitzik, 2016), and further demonstrated here.While we tried to disentangle both factors in our analysis as much as possible, it should be noted that complete separation of their effects may not be achievable due to their close interconnectedness.
Our GWAS identified half the number of SNPs associated with parasite prevalence in comparison to climate, and accordingly half the number of candidate genes.We found eight enriched biological functions in these candidate genes associated with parasite prevalence, for example, 'sensory perception of smell'.A successful first line of defence for the host is the proper recognition of intruders, followed by an appropriate behavioural response, providing the greatest benefit for host colonies (Jongepier et al., 2014).We therefore suspected high selection pressure on recognition and signalling capabilities, and anticipated SNPs in genes associated with parasite prevalence relating to chemical recognition, such as odorant and/ or gustatory receptor genes, as well as genes associated with the synthesis of CHCs, such as fatty-acid-synthase (FAS) genes (Chung et al., 2014;summarized in Holze et al., 2021;Pei et al., 2019;Wicker-Thomas et al., 2015).Indeed, we identified non-synonymous SNPs occurring exclusively in highly parasitized populations within three odorant receptor (OR) genes (TlonOR210N, -27 and -279) and one gustatory receptor gene (TlonGR91).Changes in odour perception due to specific SNPs within OR genes have previously been observed in Drosophila melanogaster (Wang et al., 2010), in which some SNPs were further associated with behavioural shifts due to reduced olfactory abilities (Rollmann et al., 2010).These variants in the OR genes might lead to a diversity in recognition abilities along the parasite prevalence (and climate) gradient.Additionally, we identified non-synonymous SNPs associated with parasite prevalence within fatty-acid synthase-like and catalase genes.Fatty acid synthases (FAS) are involved in the synthesis of CHCs and have been linked to the protection against desiccation in insects (Holze et al., 2021).Known for its role in combating reactive oxygen species (Nandi et al., 2019), catalase might be linked to changes in host stress levels along the parasite prevalence gradient.We further unveiled distinctive variants within two Cytochrome P450 genes, 6a20 and 6k1, with the latter one even containing a nonsense mutation, which might greatly impact the functionality of its protein.We also discovered variants within the gene trichohyalin-like associated with parasite prevalence.
This gene has previously been associated with cuticle-related functions in D. melanogaster and in the social bumblebee Bombus terrestris (Feng et al., 2022;Liu et al., 2022).While it remains speculative, it is plausible that this candidate gene could alter the cuticle of hosts, that is, lead to a harder cuticle with better defence abilities, favourable in highly parasitized areas.
For our analysis of the effect of climate, we identified enriched GO functions associated with catabolic, metabolic, transport and signalling-related functions for climate-associated candidate genes, which could be reflective of adjusting, managing and responding to external climate challenges.As for SNPs within genes associated with climate, we identified two genes, protein mesh and neurobeachin, containing large numbers of SNPs.Protein mesh had been previously linked to the proper development of Malpighian tubes in D. melanogaster (Jonusaite et al., 2020), an organ important for homeostasis of ions and water, therefore important for cold resistance in this insect model (Andersen et al., 2017).Similarly, neurobeachin was found to be important in heat stress response in mammals (Howard et al., 2014;Igoshin et al., 2019).

| Population-level cuticular hydrocarbon profiles and their genome associations
Cuticular hydrocarbons play vital roles in social insects, serving both as waterproofing agents and as recognition cues in the intra-and interspecific biotic exchange of information (Sprenger & Menzel, 2020).However, it remains elusive if specific compounds fulfil specific tasks, or whether they are multifunctional.Several studies indicate that methyl-branched n-alkanes often function as behavioural and recognition cues (Awater-Salendo et al., 2020;Ginzel et al., 2003;Guédot et al., 2009;Lacey et al., 2008;Ruther et al., 2011;Sakata et al., 2017;Silk et al., 2011).These methylbranched n-alkanes are likely influenced by climate due to their increased volatility in warmer temperatures as a result of their lowered melting points (Gibbs & Pomonis, 1995).Given the significance of accurate recognition, particularly in the context of social parasitism, it is reasonable to infer that the abundance of methyl-branched recognition cues could be influenced and perhaps constrained by the local climate.This, in turn, could lead to a trade-off between climate adaptations and proper parasite recognition abilities.
We hypothesized that chemical profiles will be dependent on both abiotic and biotic factors and will likely show patterns shaped by local climate and parasite prevalence alike.In our populationlevel CHC analysis, we found such a correlation between the proportion of linear versus methyl-branched alkanes with climate but not parasite prevalence.Methyl-branched alkanes are commonly used for recognition purposes, while n-alkanes may be particularly good for protection against desiccation due to their linear structure (Dani et al., 2001;Lorenzi et al., 2011;van Zweden & d'Ettorre, 2010).The relative abundance of n-alkanes increased from populations in warm/dry environments to those in cold/wet regions, which directly contrasts findings from T. longispinosus temperature acclimation experiments in which n-alkane abundance increased in high temperatures compared to moderate ones, showcasing quick plastic responses in the chemical profile to drastic temperature changes (Menzel et al., 2018).As colonies from all populations were maintained under the same stable laboratory conditions for several months before measuring CHC profiles, the observed differences between warmer and colder adapted populations likely stem from local adaptation to the original habitat rather than plastic responses to the standard laboratory conditions.
While we could not detect a correlation between parasite prevalence and the sum of all recognition cues, we observed a significant correlation between the relative abundance of the single recognition cue 11,15-;13,17-DiMeC31 and climate.This particular cue, previously linked to desiccation resistance in aphids (Yang et al., 2022), explained a large portion of the aggressive behaviour in our study species towards intruders (referred to as 11,15-DiMeC31 in the study, although it is a mixture with 13,17-DiMeC31, Jongepier & Foitzik, 2016).
With our CHC-GWAS, we identified associated SNPs within six different odorant receptor genes (OR) and two FAS-like genes.
Unlike FAS-like genes that may directly be linked with the synthesis of specific compounds, ORs are not involved in CHC biosynthesis, but are important for perception of smell and might thus naturally appear among the associated candidate genes.Notably, the majority of significant SNPs in this analysis were generated by only five of the nine recognition cues, including the previously mentioned cue 11,15-DiMeC31.Consequently, this cue met several criteria: (i) It accounted for the majority of the host's aggressive behaviour towards the parasite as shown in behavioural assays (Jongepier & Foitzik, 2016), (ii) its abundance, even under laboratory conditions, was correlated with local environmental factors, and (iii) we were able to identify particular candidate genes associated with it.
These candidate genes now warrant closer investigation in future studies.

| Antennal expressed genes and their correlation with parasite prevalence
Exposure to a social parasite has been shown to induce differential gene expression in the antennae (Stoldt et al., 2023), and more risk-taking tasks such as foraging are accompanied by an increase in antennal gene expression, even more so than brain gene expression (Caminer et al., 2023), showcasing the importance of this peripheral organ in enemy recognition.Variation in gene expression, especially in olfactory receptor genes, could potentially influence the ability to discriminate between nestmates and enemies in social insects.
However, perception also depends on the stability of cues on the cuticle, influencing their evaporation rate, further influenced by the local climate.
Our analysis of population-specific patterns of transcriptional activity in host antennae showed a strong correlation of differential transcriptional expression to the local parasite prevalence, and a considerably weaker one to climate.This strong effect of parasite prevalence on gene activity in the antennae (henceforth antennal expressed genes, AEGs) could theoretically be attributed to plasticity of gene expression in response to parasite presence, which is questionable, as on average only half of natural host colonies ever encounter a parasite, even in highly parasitized populations (Foitzik & Herbers, 2001).Our AEG dataset was also generated without the colonies being experimentally exposed to social parasites, and some workers might have even hatched in captivity.This suggests that the population-specific differences in transcript expression are probably due to local adaptation rather than plasticity, and/or that local adaptation allows for a wider palette of possible phenotypes.To be able to better detect a parasite, one could imagine hosts to increase the transcriptional activity of OR genes.However, we did not detect differential expression of odorant nor gustatory receptor genes but instead showed an enrichment of biological functions relating to post-translational modifications.Another gene expression study in brains of the same species investigating the effect of parasites on hosts revealed a similar overrepresentation of two GO-terms 'protein dephosphorylation' and 'protein ubiquitination' (Kaur et al., 2019).

This might indicate the importance of epigenetic influences on
AEGs shaped by local parasite prevalence, playing a more longterm and heritable role on baseline gene expression, in contrast to the short(er)-term quick shifts in gene expression upon encounter with a parasite (Feldmeyer et al., 2016).
We found four Cytochrome P450 genes (6k1-like, 6a14, 9e2-like and 4C1-like) in our AEGs, with one of them also appearing in the parasite prevalence GWAS.The differential expression of certain P450 genes was observed to be indicative of not only disease, parasitic infections, but also CHC synthesis (Gihan et al., 2008;Holze et al., 2021;Lin et al., 2019;Morgan et al., 2020;Nadin et al., 1995;Samanta et al., 2003).While these functions can be more generally stated to be a stress response induced by increased parasite presence, the function in CHC synthesis might play a more impactful role in parasite recognition abilities and might help to create a nuanced and unique host chemical profile.

| CON CLUS ION
In this study, we combined environmental and trait association analyses to gain deeper insights into adaptive patterns associated with parasite prevalence and climate in the host species T. longispinosus.
We found evidence for selection on a number of odorant receptor genes and differences in cuticular hydrocarbon profiles, which are both relevant in the perception of the parasite.However, we also discovered a strong link between parasite prevalence and climate across the sampled populations with varying impacts of both environmental factors on allele frequencies and phenotypic traits.For example, the transcriptional activity in the antennae was highly linked to parasite prevalence, while CHC compositions were more tightly associated with climate.We identified several interesting candidate genes likely to play a role in host adaptation, which we plan to corroborate in detailed functional analyses at colony level.
expression analysis (PoolRNASeq), submerged in 100 μL of TRIzol® and incubated overnight.Samples were stored at −80°C until extraction.Dissections were split into 4 × 25 batches per population, randomly distributed over a 2-week span to avoid batch effects, with equal representation of colonies from all subsites.The second ant was taken for pooled gas chromatography and mass spectrometry (GC-MS) to identify cuticular hydrocarbons (CHCs) and determine the population-level CHC profile.Individual ants were pooled in a glass vial with ants from the same population and stored at −20°C until CHC extraction.For DNA extraction, the four batches per population were merged into a single tube containing 45 μL ATL buffer and four ceramic beads (ø1.4 mm).Tissue was then lysed at 30.0 Hz for 9 min using QIAGEN Tissue Lyser II Retsch MM400.DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen) following the manual for insect tissue.Pooled whole-genome sequencing was conducted by Novogene with 450 bp insert size to a mean coverage of 48×.For RNA extraction of pooled antennae, all four batches per population were merged in a new tube containing eight ceramic beads (ø1.4 mm).Tissue was lysed for a total of 12 min at 30.0 Hz using QIAGEN Tissue Lyser II Retsch MM400.RNA was extracted using the Direct-zol RNA Microprep Kit (Zymo).Sequencing of 2 × 150 bp reads was conducted by Novogene using Illumina NovaSeq.For CHC extraction, pooled ants were air dried for 5 min on a clean dust-free tissue paper under the fume hood and then transferred back to their original glass vial.Each glass vial was then filled with 350 μL mixed with 50 μL of n-octadecane internal standard solution concentrated at 0.01 μg/μL, and then incubated for F I G U R E 1 (a) Map of collection sites showing the varying degrees of parasite prevalence, that is, the percentage of T. americanus colonies (see Table S1 for state abbreviations and population names) and their differences in climate indicated by PC1 climate eigenvalues (see b). Photo ©Romain Libbrecht.(b) A Principal component analysis of climate conditions at our study sites (Table S1) based on CHELSA bioclim database variables (Data S1 'Climate_data'), with warm and dry populations on the left to cold and wet populations on the right.(c) Pairwise geographical distances of populations (x-axis) versus genetic distances based on F ST values.A trend in isolation by distance between populations, as indicated by a Mantel test, was observed.an additional 10 min.Ants were carefully removed from the glass vial without removing the solute.The volume of the solution was then concentrated using constant nitrogen flow and transferred into an inlay stored in the glass vial at −20°C before proceeding to the gas chromatography on Agilent Technologies 7890A and mass spectrometry.

F
Venn diagram of single nucleotide polymorphisms (SNPs) identified using Cochran-Mantel-Haenszel test statistics with populations with high versus low parasite prevalence (red) and high versus low PC1 climate eigenvalues (green) (p < .0001).Lower part of the figure lists enriched GO terms of the genes containing outlier SNPs (p ≤ .05;Data S7 'topGO_analysis'; only top 10 shown for CMH clim ).Corresponding Manhattan plots can be found in Figure S2a,b.

a
In reference genome (based on individuals from New York Huyck).
Positive correlation of relative abundance of linear n-alkanes with PC1 climate eigenvalues.Populations are sorted according to their PC1 eigenvalue, from warm/dry to cold/humid populations.(b) Relative abundance of recognition cues (as described in Jongepier & Foitzik, 2016), with populations sorted from low to high parasite prevalence, showing a non-significant trend for a positive correlation.a) Negative correlation of PC1 of antennal transcriptome data to parasite prevalence (r = −.85;p = .002).We further identified a weak correlation between PC1 antennal transcriptional activity and PC1 climate (r = .63;p = .051;Figure S7b).(b) GO enrichment analysis of AEGs linked to parasite prevalence (Data S7 'topGO_analysis').Three functions relating to post-translational modification, 'protein prenylation', 'protein ubiquitination' and 'protein dephosphorylation', were identified.Bubble size denotes number of found AEGs linked to an enriched function and colours denote p-values of enrichment.
Our study allows us to gain deeper insights into the many facets of host-parasite coevolution and the importance of eco-evolutionary dynamics.AUTH O R CO NTR I B UTI O N S Susanne Foitzik and Barbara Feldmeyer conceived the study.Maide Nesibe Macit, Erwann Collin, Susanne Foitzik and Barbara Feldmeyer designed the experimental set-up and collected ant colonies.Maide Nesibe Macit and Erwann Collin sampled colonies and performed dissections.Maide Nesibe Macit performed DNA and RNA extractions.Erwann Collin performed CHC extractions, and integrated and identified compounds.Maide Nesibe Macit with help from Markus Pfenninger and Barbara Feldmeyer performed PoolSeq, PoolRNASeq and CHC analyses.Maide Nesibe Macit wrote the first draft of the manuscript and all authors revised it.