An evolutionary compass for elucidating selection 1 mechanisms shaping complex traits

Polygenic selection is likely to target some human traits, but the speciﬁc evo- lutionary mechanisms driving complex trait variation are largely unknown. We developed an evolutionary compass for detecting selection and mutational bias that uses polarized GWAS summary statistics from a single population. We found evidence for selection and mutational bias acting on variation in ﬁve traits (BMI, schizophrenia, Crohn’s disease, educational attainment, and height). We then used model-based analyses to show that these signals can be explained by stabilizing selection with shifts in the ﬁtness-phenotype relationship. We additionally provide evidence that selection has acted on Neanderthal alleles for height, schizophrenia, and depression, and discuss potential sources of con- founding. Our results provide a ﬂexible and powerful framework for evolution- ary analysis of complex phenotypes in humans and other organisms, and provide 24 insights into the evolutionary mechanisms driving variation in human polygenic traits. empirical If novo standing for alleles alter the in one direction than potentially dynamics of future adaptation to changes in the trait optimum. Moreover, biases in mutation for may induce detectable patterns in the relationship between allele frequency and eﬀect size. Here, we develop a powerful method for detecting mutational bias and selection within a single 71 population using GWAS summary data. We show that the polarization of GWAS summary statistics by 72 their ancestral/derived state or Neanderthal/modern human state (which we refer to as an “evolutionary 73 compass”) provides information about the evolutionary processes shaping trait variation. We propose a 74 simple summary statistic of the relationship between eﬀect sizes and allele frequency, and show that it 75 is sensitive to both mutational bias and various models of selection. We apply our approach and ﬁnd 76 that selection and mutational bias are likely to have acted on the genetic variation underlying height, 77 BMI, educational attainment, Crohn’s disease, and schizophrenia, as well as Neanderthal alleles for 78 height, depression, and schizophrenia. We then develop a model-based inference procedure to disentangle 79 mutational bias from selection, and show that both processes are necessary to explain the observed 80 GWAS summary data. Lastly, we discuss caveats and possible explanations for idiosyncratic patterns in 81 height summary statistics across independent cohorts, as well as implications of our ﬁndings for human evolutionary history and GWAS of biomedically relevant traits.


Introduction 27
Natural selection shapes variation within and between human populations, but the phenotypic targets GWAS. The alleles were identified with the S * statistic (Plagnol and Wall, 2006) and allele sharing with Neanderthal genomes, and hence do not suffer from the same potential biases from ancestral uncertainty 228 as the derived/ancestral polarization. We performed both a common allele test (S β (0.05,0.25)) and an all 229 allele test (S β (0,0.25)) for each of the nine phenotypes. We exclude alleles above frequency 0.25 because 230 the vast majority of Neanderthal alleles have frequencies below 0.25. 231 Across the nine phenotypes we studied, we found strong signals for both height (p <5e-4) and major 232 depression (p <5e-4) (Fig. 4), and a marginally significant signal for schizophrenia that did not pass a 233 multiple testing correction (p =0.015). The height and schizophrenia signals were strongest when includ-234 ing only common alleles, whereas the depression signal was driven primarily by rare alleles. The height 235 signal suggests selection that favored Neanderthal height-increasing alleles, while the depression signal 236 suggests that Neanderthal alleles had a large impact on depression risk, which have been preferentially 237 pushed to low frequency by selection. The schizophrenia signal suggests selection against schizophrenia 238 risk or a bias towards protective alleles from Neanderthals. No other phenotypes had significant p-values. 239 Potential confounders. Although S β is not sensitive to confounding by ancestral uncertainty, we were 240 concerned that these signals could potentially be explained by other confounders, such as uncorrected on alleles with MAF> 1% and MAF> 5%, and found that the signals are robust within common alleles 245 alone, suggesting that population structure is unlikely to confound our estimates (Fig S9-17 and Tab. 1). 246 In addition, we computed the βDAF curve for subsets of SNPs that are minimally differentiated between 247 northern and southern Europe. There is a known gradient of height from south to north in Europe 248 (Turchin et al., 2012), so if the signals we observe were confounded by population stratification, we would 249 expect to see a flatter βDAF curve for less geographically differentiated SNPs at which little differentiation 250 in frequency exists between these regions. In contrast, we observe a pattern that is concordant across all 251 subsets of SNPS regardless of the frequency difference between northern and southern Europe (Fig. S7).

Discussion
it has long been theorized that it drives some of the variance in a broad range of human complex 262 phenotypes (Hernandez et al., 2011). Here, we developed a novel empirical framework (PASTEL) and 263 rejection sampling approach for detecting polygenic selection and mutational bias that can be applied 264 to GWAS summary data for a single population. Our approach can capture relatively ancient selection 265 events, and retains power when applied only to common alleles, for which it is more straightforward to 266 correct for population stratification. We call this approach an "evolutionary compass", because orienting 267 alleles by their ancestral/derived or modern human/Neanderthal status within our framework provides 268 insight into the evolutionary processes shaping complex traits. We applied this evolutionary compass 269 to GWAS summary data for nine phenotypes, and showed that five of them (educational attainment, 270 height, Crohn's disease, BMI, and schizophrenia) suggest a role for selection in shaping trait variation. precision medicine initiatives relying on personal genetic information will be most successful if genetic 290 risk can be accurately inferred in diverse populations. While this effect could be attributed to neutral demographic forces, if selection has driven numerous phenotypes to acclimate to local environmental 292 conditions in ancestral human populations worldwide it could exacerbate this problem dramatically.

293
In the field of evolutionary genomics, most studies have agreed that the impact of selection is 294 widespread on the human genome, but the evolutionary mechanisms that drive genetic and phenotypic 295 diversity have been widely debated (Hernandez et al., 2011;Enard et al., 2014). In our study, we showed 296 that the selection signals in Europeans for BMI and Crohn's disease are consistent with a bias in mutation 297 rate towards trait-increasing alleles, and a shift to a lower optimal value of the trait, while height and 298 educational attainment are consistent with a mutational bias towards trait-increasing alleles and a shift 299 towards higher values of the trait optimum. The signal for schizophrenia is consistent with an ancestral 300 shift towards a lower optimum and a stabilizing selection, with or without a mutational bias. It should be 301 noted that the model we used to fit these data assumed no more than one shift in the optimal phenotype 302 value, whereas this quantity is likely to vary continuously with environmental conditions. Models that     The most conservative interpretation of the non-replication of height is to suppose that the some of 334 the signals we and others observed in the GIANT cohort are driven by population structure, and the 335 UK Biobank analysis correctly removes this signal. Indeed, LD score regression on height GWAS data 336 is consistent with a moderate amount of residual uncorrected population structure (Bulik-Sullivan et al.,  would be tempting to identify intelligence as the target of selection here, it is possible that any (unknown) 354 trait with a shared genetic basis and correlated effect sizes could be the target. This remains a major 355 problem in detecting and interpreting selection signals on complex traits.

356
Among the nine traits that we tested, we found that five had strong signals of polygenic selection, and 357 11 . CC-BY-NC 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted June 7, 2018. . https://doi.org/10.1101/173815 doi: bioRxiv preprint seven of the nine had marginal evidence for non-neutrality. However, this does not imply that the others 358 are not under selection. The power of our test depends on the strength of selection, the polygenicity 359 of the trait, the heritability of the trait, and the mutational bias, in addition to the size of the GWAS.

360
If a trait is under strong stabilizing selection, but the mutation rate of trait-increasing and -decreasing 361 alleles is exactly equal, then our test has no power. Moreover, if selection is weak, a small number of 362 causal alleles drive variance in the trait, or the trait is only weakly heritable, power is greatly diminished.

363
However, increased sample sizes in GWAS will increase our power, because the variance on effect size 364 estimates for even weak effect causal alleles decreases with sample size. In addition to working directly 365 on GWAS summary statistics from a single population, one strength of our permutation-based approach 366 is that other informative statistics, such as the absolute value of the deviation between ancestral and 367 derived effect sizes, could also easily be applied, and may have higher power. In future studies, it will 368 be advantageous to apply and compare other summary statistics, and to use the information in such      Correcting subtle stratification in summary association statistics. bioRxiv , 076133.               An evolutionary compass to detect selection and mutational bias 540 Selection on quantitative traits is known to induce dependency between mean squared effect sizes and  Models with symmetric mutation rates have often been applied to quantitative traits, but there is 551 no a priori reason to suppose that this condition is met in natural populations. Indeed, it is possible 552 that there are an unequal number of fixed bi-allelic sites genome-wide that can increase or decrease a 553 phenotype relative to its current value, which naturally will change the relative rate of trait-increasing as 554 compared to trait-decreasing alleles. For example, if past selection events have driven the phenotype to 555 ever larger values, we might expect that the majority of trait-increasing alleles have already been fixed 556 by positive selection in the evolutionary past, and that further recurrent mutations at these fixed sites 557 would therefore decrease the phenotype.

558
Temporal variation in the optimal value of selected traits may also be an important determinant of the 559 evolutionary dynamics of complex traits, as such changes may be a mechanism for polygenic adaptation 560 (Jain and Stephan, 2017). When a population's environment is altered, perhaps by migration, a change 561 in climate, or the elimination/introduction of competing species, it is likely that selection pressures on 562 phenotypes will also change. If the population persists long enough in this new environment it is expected 563 that the phenotype mean will approach the new phenotype optimum, and the population will again be 564 at equilibrium. In the intervening time, while the population is out of equilibrium, allele frequencies will 565 shift as a function of their effect on the phenotype. Our goal is to capture selection's impact on the causal 566 variation during this out-of-equilibrium time period.

567
In this project, we sought to capture how mutational bias and selection pressure would affect the 568 relationship betweenβ and allele frequency, and to use this information in designing a statistical test to 569 detect mutational bias and selection. In the following sections, we use analytical theory and simulations 570 to build intuition about the relationship between these evolutionary processes and patterns that could 571 be observed in GWAS summary statistic data.

572
Building intuition with the βDAF plot and S β

573
To build intuition about how evolutionary parameters affect the relationship between β and DAF we 574 performed analytical calculations and simulations of simple evolutionary models that included directional 575 selection, stabilizing selection, and mutational bias. Throughout these calculations, we suppose that 576 although mutation rate can be biased towards trait-increasing or trait-decreasing alleles, the distribution 577 of trait-increasing alleles is the same as that for trait-decreasing alleles.
Hence, if the mutational bias is δ is the proportion of de novo mutations that are trait-increasing and the mean of the absolute value of a new mutation is . Note that 583 when δ = 0.5, mutation rates are symmetric, and E[β] = 0.

584
Supposing that we decompose the frequency spectrum into k bins, when there is no selection . We perform this calculation for a variety of values of δ and a 586 symmetric distribution of effect sizes, as well as stochastic simulations in which we select effect sizes 587 randomly (Fig. S1D), and observe excellent agreement between this simple calculation (solid lines) and 588 the simulations (points). captures the idea that many alleles will have weak effects, and hence weak selective effects, while a small 599 portion of alleles may have strong effects on the trait and be under relatively strong selection. We further 600 suppose that the distribution and mutation rate of trait-increasing are the same as trait-decreasing alleles. 601 We assume that the influence of binomial sampling is small and we do not account for it.
where Γ[a, b, γ] = a b Γ(a) γ a−1 e −bγ , a and b are the parameters of the Γ-distribution, ζ is the Riemann zeta 609 function, and γ is the population-scaled selection coefficient 2N s. Trait-decreasing alleles have a neutral 610 frequency spectrum, which is given by f − a,b (x) = 1 x . The mean value of β for trait increasing alleles at 611 frequency x is given by .
(3) For trait-decreasing alleles, the mean value of β does not depend on the derived allele frequency x, and 613 is given by Combining terms and weighting by the frequency spectra for trait-increasing and trait-decreasing 24 . CC-BY-NC 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted June 7, 2018. . https://doi.org/10.1101/173815 doi: bioRxiv preprint alleles, we find that the mean value of β as a function of DAF x (β(x)) is given by . results show that trait-increasing alleles will be more likely to increase in frequency (Fig. S1B), and will 618 outnumber trait-decreasing alleles at all frequencies, resulting in β a,b (x) > 0 at all frequencies (Fig. S1C).

619
This implies that under this simple polygenic adaptation model, the S β will increase monotonically in

634
Under this model, we are able to calculate β a,b (x) using the same procedure as in the previous section. 635 We find

654
We develop a polygenic selection quantitative trait model that maps selection coefficients s to effect 655 sizes β using the well-established Gaussian stabilizing selection model (Robertson, 1956; Barton, 1986). 656 We suppose that stabilizing selection acts on a trait, and that the fittest value of the trait is φ o (also 657 referred to as the "trait optimum"), such that the fitness f of an individual with trait value φ is given by where w is the breadth of the fitness function. We additionally suppose that the trait φ has a normal 659 distribution such that where σ is the standard deviation of the fitness distribution andφ is the mean trait value in the population. the dependence on the current mean phenotype value (φ) and the phenotype optimum (φ o ), rather than 666 assuming that the phenotype distribution is centered at the optimum. While the full expression for the 667 expected change in allele frequency is provided in the next section, we note that when the trait is at and the expected frequency change δp for an allele at frequency p with effect size β whenφ = φ o is

26
. CC-BY-NC 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted June 7, 2018. . https://doi.org/10.1101/173815 doi: bioRxiv preprint substituting in P (φ) and f (φ), these integrals can be solved to obtain w aa ∝ e −(2pβ+φo −φ) 2 2(w 2 +σ 2 ) , w ad ∝ Taking a series expansion about β = 0, we find that For the simulations presented in Fig. 1, the curves represent the mean over 100 independent simu-724 lations. We used N = 7, 000 (where N is the ancestral population size) for the simulations in Fig. 1.,   725 but rescaled the population size for computational efficiency to 2,000 for parameter inference (see next 726 section). The simulation code was implemented in Python and numpy, and will be freely available and

733
Our goal in these simulations is to understand the qualitative behavior of S β as a function of evolutionary 734 parameters (Fig. 1C-E), and in subsequent sections we develop a more quantitative approach for mapping 735 evolutionary parameters to observed patterns in GWAS summary statistics.

748
We performed simulations of Gaussian stabilizing selection with shifts in the optimal phenotype value 749 using parameter values that were sampled from broad prior distributions, and we fit these simulated data 750 to the observed S β (0.01, x) data for our phenotypes (we exclude alleles below 1% frequency to avoid 751 the potential effects of rare variant stratification). As summary statistics, we take S β (0.01, x) for x ∈ have effect size 0, representing the portion of the genome that is not causal for the trait but is included 777 in the GWAS. We calculate the sum of squared differences between our simulated and observed statistics 778 for all 10 5 simulations, and retain the 0.25% of simulations that minimize this statistic.

779
To test our rejection sampling method, we masked the input parameter set of each simulation and 780 attempted to infer the parameter values that were used to run the simulation. We used the maximum 781 a posteriori estimate as a point estimate of the parameter value for these experiments. We found high 782 predictability of the direction of both mutational bias δ and ∆φ (Fig. 3C&D), and noisy estimation of 783 the magnitude of these parameters. Most other parameters were poorly predicted. We found that the 784 true timing of selection events was highly correlated with the inferred timing, but a large number of 785 outlier estimates make the estimates of this parameter less trustworthy, so we do not report it. All other 786 parameters were inferred with low accuracy, hence we only report δ and ∆φ.

787
In addition to the power to detect mutation rate bias (δ) and shifts in optimal phenotype value (∆φ), 788 we also considered the false positive rate for detection of these parameters (Fig. S5). We examined the 789 subset of simulations among our 10 5 that had |∆φ| < 0.25 and the subset with 0.45 < δ < 0.55 and 790 assessed the proportion of these simulations for which we estimated maximum a posteriori parameters 791 larger than y for all estimated values y. We found that the false positive rate is high for both small 792 mutation rate biases and weak shifts, but low for effects as large as those we infer in the data, implying 793 that our method is unlikely to result in a multiple false positives for shifts and mutation rate biases.

Neanderthal alleles
generously provided by Rajiv McCoy and Josh Akey. In analyses using Neanderthal alleles, we masked 855 all alleles that were not present in this dataset. The dataset includes 139,694 SNPs, 139,381 of which we 856 were able to map to rs numbers and use in our analysis.

857
The impact of ancestral uncertainty on S β 858 Ancestral states are not directly observed in genomic data, and are typically inferred by comparing human 859 sequences to those of an out-group. The underlying assumption is that the allelic state in the out-group 860 represents the most likely ancestral state for the allele. While this approach correctly assigns the ancestral 861 state for the majority of alleles, it does not account for recurrent fixation events at a single site, leading 862 to some rate of ancestral misassignment. Since our method to detect selection compares ancestral and 863 derived alleles, we sought to understand how ancestral misassignment would impact our inferences.

864
Ancestral misassignment will tend to decrease the absolute value of S β within each frequency bin, 865 and hence decrease our power. To see this, suppose that for a given minor allele frequency x, we misassign 866 ancestral state with probability 1 − p, and Ψ(x) is the number of derived alleles observed at frequency x.

867
We note that we can rewrite S β as the difference in mean beta between ancestral and derived alleles of 868 equal minor allele frequency (i.e., where β D is the effect size of derived alleles and β A is the effect size of ancestral alleles. Note that if

874
While this analysis shows that ancestral uncertainty serves to make our analyses more conservative, 875 we also sought to understand its potential impact on the observed relationship between allele frequency 876 and effect size (Fig. S8). Because Ψ(x) is generally much larger than Ψ(1 − x) for x < 0.5 (i.e., there are 877 more rare alleles than common alleles in genomic sequencing data), we expect that the impact of ancestral 878 uncertainty will be greatest at very high derived allele frequencies (e.g., x > 0.9). We fit a linear model 879 by regressing the mean effect size on allele frequency for the observed height data for derived alleles in the 880 frequency range 40-60%, and extrapolated this curve out to 100% frequency, supposing that frequencies 881 near 50% were only modestly impacted by ancestral uncertainty. We then simulated the impact of 882 ancestral uncertainty at various levels from p = 1% to p = 10%. At p = 10% uncertainty we see a 883 striking resemblance between our simulated data and the observed data. We conclude that moderate 884 levels of ancestral uncertainty are likely responsible for the "S " shaped curve that we observe for many 885 of our phenotypes.  Figure S6: Correlation between mean effect size and allele frequency (A) and correlation the difference in frequency between northern and southern Europe for the height increasing allele and p-value rank (B) in the GIANT study (gold) and the UK Biobank (gray). Note that alleles were grouped in to bins of 2,000 by p-value and were not thinned by linkage.

49
. CC-BY-NC 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted June 7, 2018. . https://doi.org/10.1101/173815 doi: bioRxiv preprint