Genetics of adaptation: Experimental test of a biotic mechanism driving divergence in traits and genes

Abstract The genes underlying adaptations are becoming known, yet the causes of selection on genes—a key step in the study of the genetics of adaptation—remains uncertain. We address this issue experimentally in a threespine stickleback species pair showing exaggerated divergence in bony defensive armor in association with competition‐driven character displacement. We used semi‐natural ponds to test the role of a native predator in causing divergent evolution of armor and two known underlying genes. Predator presence/absence altered selection on dorsal spines and allele frequencies at the Msx2a gene across a generation. Evolutionary trajectories of alleles at a second gene, Pitx1, and the pelvic spine trait it controls, were more variable. Our experiment demonstrates how manipulation of putative selective agents helps to identify causes of evolutionary divergence at key genes, rule out phenotypic plasticity as a sole determinant of phenotypic differences, and eliminate reliance on fitness surrogates. Divergence of predation regimes in sympatric stickleback is associated with coevolution in response to resource competition, implying a cascade of biotic interactions driving species divergence. We suggest that as divergence proceeds, an increasing number of biotic interactions generate divergent selection, causing more evolution in turn. In this way, biotic adaptation perpetuates species divergence through time during adaptive radiation in an expanding number of traits and genes.

We measured the length of the first dorsal spine, pelvic spine, and standard length. If a given trait was absent it was given a value of zero. Both spine traits scaled linearly with standard length (assessed using linear models), so trait values were size corrected to compare trait values among individuals of different sizes (see Figure S3 for an example of the relationship before and after correction). Traits were size corrected to an average length (32 mm, which was the mean size of individuals in September 2012 F2 cohort; see Figure S4 for the size distributions across ponds and time points) using the equation: " = " − ( " − * ), where " is the size-adjusted armour trait, " is the original armour trait, is the regression coefficient of the un-adjusted armour trait values on standard length, " is the standard length of the individual and * is the average length of the sample (2). Trait values of zero (i.e. when the trait was absent) were not corrected for size. When the trait values were plotted it was apparent that pelvic spine phenotypes fell into two distinct clusters, 'high' and 'low' armour.
We used Gaussian Mixture Modelling for model-based clustering, using the mclust package (3), to identify these high and low pelvic phenotypes. These two pelvic armour clusters had different relationships (β) between standard length and the focal trait value; correspondingly size correction was done independently for the two clusters. Corrections were done independently for each time period. For some time periods, linear models indicated that pond had a significant effect on β, in which case size correction was done independently for each pond. All analyses reported in this paper were undertaken using these size corrected measurements.

Library preparation, sequencing and genotyping
DNA was prepared for Illumina sequencing using the PstI enzyme following the genotyping by sequence method of Elshire et al. (4), with the addition of a gel size selection of fragments 500 -700 base pairs (bp) in length. One hundred and ninety-two individuals were uniquely barcoded and combined into each library, for a total of seven libraries.
Libraries were sequenced at the University of British Columbia's Biodiversity Next Generation Sequencing Centre on an Illumina HiSeq 2000. Reads were 100 bp in length and sequencing was paired end.
Sequence variants were identified using a standard, reference-based bioinformatics pipeline (see archived code for full details). After demultiplexing, Trimmomatic (5) was used to filter out low quality sequences and adapter contamination. Reads were then aligned to the stickleback reference genome (6) and a bacterial artificial chromosome (BAC) sequence containing the complete pituitary homeobox 1 (Pitx1) coding sequence (7) (GeneBank Accession: GU130435.1) using BWA v0.7.9a (8), subsequent realignment was done with STAMPY v1.0.23 (9). For genotyping the GATK v3.3.0 (10) best practices workflow (11) was followed except that the MarkDuplicates step was omitted. RealignTargetCreator and IndelRealigner were used to realign reads around indels and HaplotypeCaller identified single nucleotide polymorphisms (SNPs) in individuals. Joint genotyping was done across all individuals using GenotypeGVCFs. The results were written to a single VCF file containing all variable sites. This file was filtered for a minimum quality score (of 20) and depth of coverage (minimum of 8 reads and maximum of 100,000) before use in any downstream analyses.

Linkage and quantitative trait locus (QTL) mapping
The pedigree of F2 individuals was determined using the MasterBayes R package (12) using 1799 SNPs, which had minimal missing data across individuals (>90% of individuals with data). To have markers that were fully informative for linkage mapping we identified the SNPs that were homozygous for alternative alleles in the benthic and limnetic grandparents of each F2 cross. We then used these SNPs to calculate pairwise recombination frequencies and create a genetic map using JoinMap version 3.0 (13). In total 398 F2 progeny from four SNPs for the Pitx1 locus). Selection was estimated for each SNP independently, then the individual estimates of selection were averaged across the SNPs that were informative in each family.

Selection Analyses
We estimate the standardized evolutionary response of phenotype in Haldanes (h) as: where Δ ̅ is evolutionary response, Δ ̅ = ( ̅ after -̅ before)/ 3pooled, ̅ before and ̅ after are the mean phenotype or allele frequency in the generation before and after selection, 3pooled is the square root of pooled sample variance of the trait or allele frequency in the first and second generation, and g is the number of generations of selection, which in our case is 1 (15).
We estimate the treatment effect of evolutionary response within a pond pair (Δh) as: where ht is the evolutionary response in the trout addition pond and hc is the evolutionary response in the control pond.
We used the same formulas and methods to estimate the evolutionary response and treatment effect for allele frequencies at SNP markers near the QTLs for pelvic spine and first dorsal spine length. The input was the mean limnetic allele frequency for the individuals in the sample, standard deviation was also estimated from these allele frequencies.
We used linear models to describe the phenotypic trait trajectories through time. These models included a quadratic term which allowed us to model curvature in the trajectories through time. We quantified the difference between treatments within a family for both curvature and linear slope as follows: