Large Allele Frequency Differences between Human Continental Groups are more Likely to have Occurred by Drift During Range Expansions than by Selection
Summary
Several studies have found strikingly different allele frequencies between continents. This has been mainly interpreted as being due to local adaptation. However, demographic factors can generate similar patterns. In particular, allelic surfing during a population range expansion may increase the frequency of alleles in newly colonised areas. In this study, we examined 772 STRs, 210 diallelic indels, and 2834 SNPs typed in 53 human populations worldwide as part of the HGDP‐CEPH Diversity Panel to determine to what extent allele frequencies differ among four regions (Africa, Eurasia, East Asia, and America). We find that large allele frequency differences between continents are surprisingly common, and that Africa and America show the largest number of loci with extreme frequency differences. Moreover, more STR alleles have increased rather than decreased in frequency outside Africa, as expected under allelic surfing. Finally, there is no relationship between the extent of allele frequency differences and proximity to genes, as would be expected under selection. We therefore conclude that most of the observed large allele frequency differences between continents result from demography rather than from positive selection.
Introduction
On a worldwide scale, human populations show a large phenotypic variability, particularly for skin colour, face and body shapes, susceptibility to pathogens, as well as for the prevalence of genetic diseases (Lewontin, 1995). However, most of the genetic variation in humans is found within populations rather than among populations or geographic regions (Lewontin, 1972, Barbujani et al. 1997, Rosenberg et al. 2002b). Still, many studies have focused on traits or loci showing geographically restricted distribution, or on loci showing drastic allele frequency differences between two regions. These particular cases can indeed reveal important information about local selective pressures or about the demographic histories of different populations (Balaresque et al. 2007). It is, however, difficult to disentangle the effects of positive selection from those of demography, since past demographic events such as population bottlenecks or range expansions can mimic the genetic signatures of a selective sweep, such as long‐range linkage disequilibrium and reduced allelic diversity.
The colonisation of the world by modern humans was probably accompanied by a series of founder effects with subsequent local population expansions (Handley et al. 2007). Strong bottlenecks have also certainly occurred during the exit out of Africa and at the onset of the colonisation of the Americas by people from Asia (Fagundes et al. 2007, Goebel et al. 2008). These bottlenecks, followed by a spatial expansion, can lead to the geographic spread of an allele that rides on the wave of advance of the spatial expansion, a phenomenon called allelic surfing (Edmonds et al. 2004, Klopfstein et al. 2006, Travis et al. 2007). New mutations arising on the wave front and extant alleles may surf successfully (Excoffier & Ray, 2008), spreading geographically and increasing in frequency in the newly colonised areas (Klopfstein et al. 2006). A combination of simulation, analytical and experimental studies has shown that the probability for an allele to successfully surf is increased in the presence of spatial bottlenecks, when local deme size is small, and when populations at the wave front grow rapidly and exchange few genes with their neighbours (Klopfstein et al. 2006, Hallatschek et al. 2007, Travis et al. 2007, Excoffier & Ray, 2008, Hallatschek & Nelson, 2008). This neutral process has received much attention recently because its effects on allele frequencies mimic those of selective processes (Nielsen et al. 2007).
However, it is clear that human populations colonising novel habitats have been confronted by new selective pressures due to their exposure to different climates, food sources, and pathogens (Balaresque et al. 2007). Some of these selective pressures certainly triggered local adaptation that affected allele frequencies at several loci. However, neutral allele surfing, like selection, will occur at only a few loci. It will therefore not affect all loci uniformly, unlike other demographic factors such as demographic expansions, inbreeding or bottlenecks.
Until recently, most human genes showing strong geographic structures were considered to be under positive selection (see Table 1, where 44 such genes are listed). Most of these genes show a marked difference in allele frequencies (typically larger than 20%) between African and non‐African populations. In many of these studies, local selection outside Africa was thought to have promoted these large allele frequency differences. Prominent examples are two genes that are involved in the control of brain size, MCPH1 and ASPM (Evans et al. 2005, Mekel‐Bobrov et al. 2005). Both genes showed an increased frequency of a derived allele outside Africa and high levels of linkage disequilibrium. The authors therefore hypothesised that the derived haplotypes were under local positive selection in non‐African populations. However, Currat et al. (2006) showed by spatially‐explicit simulations that similar geographic distributions of allele frequencies could be generated by neutral allelic surfing during the range expansion outside Africa.
In this study, we explore data from the HGDP‐CEPH Diversity Panel consisting of 772 STRs, 210 insertion‐deletion polymorphisms and 2834 SNPs typed in 53 populations worldwide to determine the prevalence of large allele frequency differences between regions. We find that large allele frequency differences between continental regions are extremely common, as they occur at almost one third of all loci. We discuss the respective role of selection and demographic factors for shaping these patterns in the light of geographic and genomic information.
Material and Methods
Data
We analysed three multilocus data sets containing short tandem repeats (STR), insertion‐deletion polymorphisms (indel) and single nucleotide polymorphisms (SNP), respectively, typed in 53 worldwide populations belonging to the CEPH Human Genome Diversity Panel (Cann et al. 2002, Rosenberg et al. 2002b, Ramachandran et al. 2005, Conrad et al. 2006). The individuals analysed correspond to the H1048 subset defined by Rosenberg (2006), which excludes atypical and duplicated samples. The datasets were downloaded from the web site http://rosenberglab.bioinformatics.med.umich.edu/diversity.html.
Initially, the STR data set contained 783 loci typed in 1048 individuals, but we have removed eleven loci showing overall more than 10% missing data (GATA43C11, GGAA22E01, GATA193D02, GATA135F02P, AAC023, ATT015, ATT077P, GATA63C02, ATA109H09, GATA7F09, and TTTA033), and we thus analysed a total of 9210 alleles at 772 STR loci. We also examined 210 diallelic indels that were typed in the same 1048 individuals, as well as 2834 SNP loci that were typed in a subset of 927 individuals.
The populations were grouped in five main geographic regions, following Rosenberg et al. (2002b): Africa, Eurasia, East Asia, America, and Oceania (Excoffier, 2003, see also Bastos‐Rodrigues et al. 2006, Li et al. 2008). A complete list of the populations is found in Table S1.
Analyses
The STR, indel and SNP data sets were analysed separately. We used ARLEQUIN ver 3.11 (Excoffier et al. 2005) to calculate the average frequency of each allele in the populations. The R statistical package (R Development Core Team, 2008) was used to develop scripts for the analyses listed below.
For each allele i, we computed the average allele frequency $\bar{f}_{ij}$ within each geographic region j, as well as the difference with the average frequency computed over all other populations as $\Delta F_{ij} = \bar{f}_{i\bar{j}} - \bar{f}_{ij}$, where $\bar{f}_{i\bar{j}}$ is the average frequency of allele i in all populations not belonging to the geographic region j. With this convention, a positive ΔF means that the allele is less frequent within region j than in the rest of the world. This was done for all regions except Oceania, because there are only two populations in this region, and therefore the average frequency is subject to large fluctuations and the power to detect significant differences is low. For STR data, we also computed for each locus the index ΔFmax as the largest absolute value of ΔF found among all alleles present at that locus. This index ΔFmax allows us to characterize allele frequency differences at each locus with a single statistic, as in the case of diallelic loci. For diallelic loci, ΔFmax = |ΔF|.
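As an illustration of the two statistics defined above, the following minimal Python sketch computes ΔF and ΔFmax for one locus. It assumes that the regional averages are unweighted means over populations; the population names, region labels and frequencies are hypothetical toy values, not data from the panel.

```python
from statistics import mean


def delta_f(freqs, region_of, region):
    """Return ΔF for one allele: mean frequency outside `region` minus mean inside.

    freqs     -- dict mapping population name -> frequency of the allele
    region_of -- dict mapping population name -> region label
    """
    inside = [f for pop, f in freqs.items() if region_of[pop] == region]
    outside = [f for pop, f in freqs.items() if region_of[pop] != region]
    return mean(outside) - mean(inside)


def delta_f_max(locus_freqs, region_of, region):
    """Largest absolute ΔF among all alleles of one locus."""
    return max(abs(delta_f(f, region_of, region)) for f in locus_freqs.values())


# Hypothetical toy data: one diallelic locus typed in four populations.
region_of = {"P1": "Africa", "P2": "Africa", "P3": "Eurasia", "P4": "East Asia"}
locus = {
    "A": {"P1": 0.1, "P2": 0.3, "P3": 0.7, "P4": 0.9},
    "B": {"P1": 0.9, "P2": 0.7, "P3": 0.3, "P4": 0.1},
}

df_a = delta_f(locus["A"], region_of, "Africa")   # positive: allele A is rarer in Africa
df_b = delta_f(locus["B"], region_of, "Africa")   # negative: allele B is more common in Africa
df_max = delta_f_max(locus, region_of, "Africa")  # equals |ΔF| for a diallelic locus
```

For a diallelic locus the two alleles give ΔF values of equal size and opposite sign, so ΔFmax reduces to |ΔF|, as stated in the text.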
We randomly permuted populations between regions and recomputed ΔF each time, to obtain its null distribution and test for the significance of ΔF for each allele. The same permutation procedure was used to test if the number of alleles with a given frequency difference (kΔF) between a region and the rest of the world was significantly larger than expected by chance.
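The free permutation procedure can be sketched as follows. This is only an illustration under stated assumptions: the test is taken to be two‐sided on |ΔF|, the observed configuration is counted as one permutation, and the function names, the number of permutations and the toy frequencies are all hypothetical.

```python
import random
from statistics import mean


def delta_f(freqs, in_region):
    """ΔF = mean frequency outside the region minus mean frequency inside it."""
    inside = [f for f, flag in zip(freqs, in_region) if flag]
    outside = [f for f, flag in zip(freqs, in_region) if not flag]
    return mean(outside) - mean(inside)


def permutation_p_value(freqs, in_region, n_perm=999, seed=1):
    """p-value for |ΔF| obtained by shuffling region membership among populations."""
    rng = random.Random(seed)
    observed = abs(delta_f(freqs, in_region))
    labels = list(in_region)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # region sizes are preserved, membership is randomised
        if abs(delta_f(freqs, labels)) >= observed - 1e-12:
            hits += 1
    # Counting the observed configuration itself is a common convention,
    # assumed here rather than taken from the study.
    return (hits + 1) / (n_perm + 1)


# Hypothetical allele frequencies in ten populations; the first three form the tested region.
freqs = [0.90, 0.80, 0.85, 0.10, 0.15, 0.20, 0.10, 0.05, 0.15, 0.10]
in_region = [True, True, True] + [False] * 7
p_structured = permutation_p_value(freqs, in_region)

# With identical frequencies everywhere, every permutation matches the observed ΔF of 0.
p_flat = permutation_p_value([0.5] * 10, in_region)
```

The same loop could be reused for kΔF by counting, in each permuted data set, the number of alleles falling in a given ΔF class instead of recording a single ΔF.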
We also introduced a procedure to test if a random set of populations that are geographically close to each other also present sharp allele frequency differences with the rest of the world. Taking geography into account is actually a more stringent test of allele frequency differences than a procedure based on free permutation of random populations, because populations closer to each other tend to be more similar than populations at greater distance, due to isolation by distance and shared history. However, when regions consist of only a small number of populations, such as America, the number of possible random groups is reduced. kΔF was tested by taking geographical constraints into account as follows: a random population is assigned to the group representing the tested region, and the other populations allocated to this group are drawn at random from the 2Pj − 1 geographically closest populations, where Pj is the number of populations in the tested region. The geographic distance between populations was computed as the shortest distance on land (i.e. least‐cost path avoiding seas) using the software PATHMATRIX (Ray, 2005).
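The geographically constrained draw described above might look like the sketch below. It assumes that the 2Pj − 1 closest populations are ranked by distance to the randomly chosen first population and do not include it. The distance table is a hypothetical stand‐in (populations placed on a line) for the least‐cost land distances computed with PATHMATRIX.

```python
import random


def constrained_group(distance, group_size, rng):
    """Draw one random group of `group_size` geographically close populations.

    distance   -- dict of dicts, distance[a][b] = geographic distance between a and b
    group_size -- P_j, the number of populations in the tested region
    rng        -- a random.Random instance
    """
    pops = list(distance)
    first = rng.choice(pops)  # population assigned to the group at random
    # Rank all other populations by their distance to the first one.
    others = sorted((p for p in pops if p != first), key=lambda p: distance[first][p])
    # Keep the 2 * P_j - 1 closest ones (fewer if the panel is too small).
    pool = others[: 2 * group_size - 1]
    return [first] + rng.sample(pool, group_size - 1)


# Hypothetical panel: ten populations placed on a line, distance = difference in position.
positions = {f"pop{i}": i for i in range(10)}
distance = {a: {b: abs(pa - pb) for b, pb in positions.items()}
            for a, pa in positions.items()}

group = constrained_group(distance, 3, random.Random(0))
```

Each such group would then replace the tested region in the computation of kΔF, and repeating the draw many times gives the null distribution used for the second p‐value.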
If allelic surfing were a major driving force behind allele frequency differences, we would expect to find more STR alleles with a higher frequency in newly colonised areas, because surfing promotes the increase in frequency of low frequency alleles. However, we would not necessarily expect to find any asymmetry in the direction of frequency change of derived SNP and indel alleles, since surfing should affect equally ancestral and derived alleles. We tested these predictions by performing a sign test on the number of alleles having increased or decreased in frequency outside a region of interest. The ancestral allele for each human SNP was inferred by comparisons with orthologous alleles in the chimpanzee and rhesus macaque genome assemblies, available in the Table Browser at the UCSC Genome Bioinformatics Site (http://genome.ucsc.edu/, table snp128OrthoPanTro2RheMac2; Karolchik et al. 2008). The ancestral allele was assumed to be identified if both the chimpanzee and macaque alleles were described and identical, or if an allele was only known in one of these two species. If orthologous alleles were known in both species but were different from each other, the ancestral allele was assumed to be the chimpanzee allele if the human variants contained the chimpanzee allele but not the macaque allele. In all other cases the ancestral allele was assumed to be unknown. Likewise the ancestral state of the indels was inferred by comparing human allelic diversity to orthologous alleles in the chimpanzee and in the gorilla (Weber et al. 2002). In this way, we were able to determine the ancestral allelic states of 176 indel loci and 1530 SNP loci. We then used the R function ‘sign.test’ (package BSDA; Arnholt, 2007) to perform a sign test allowing us to determine if there is any asymmetry in the frequency change of derived alleles. The genomic positions of a subset of 476 STRs, 162 indels and 2784 SNPs could be determined in the NCBI Build 35 reference system.
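The decision rules used for SNPs in the paragraph above can be written compactly as a small function. This is a sketch that follows the wording of the rules literally; the function name, argument names and the use of `None` for an undescribed or unknown allele are illustrative choices, not part of the original analysis scripts.

```python
def infer_ancestral(human_alleles, chimp=None, macaque=None):
    """Return the inferred ancestral allele, or None when it is considered unknown.

    human_alleles  -- set of alleles observed in humans, e.g. {"A", "G"}
    chimp, macaque -- orthologous allele in each species, or None if not described
    """
    if chimp is not None and macaque is not None:
        if chimp == macaque:
            return chimp  # both outgroups are described and identical
        if chimp in human_alleles and macaque not in human_alleles:
            return chimp  # outgroups differ, only the chimpanzee allele is seen in humans
        return None       # outgroups differ and the conflict cannot be resolved
    if chimp is not None:
        return chimp      # only the chimpanzee allele is known
    if macaque is not None:
        return macaque    # only the macaque allele is known
    return None           # no outgroup information


# Hypothetical A/G polymorphism under different outgroup configurations.
human = {"A", "G"}
calls = [
    infer_ancestral(human, "A", "A"),   # -> "A"
    infer_ancestral(human, "A", None),  # -> "A"
    infer_ancestral(human, None, "G"),  # -> "G"
    infer_ancestral(human, "A", "C"),   # -> "A"
    infer_ancestral(human, "A", "G"),   # -> None
    infer_ancestral(human, "C", "T"),   # -> None
]
```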
The distance to the nearest gene was computed for each of the mapped loci, and varied from 0 (when the marker is found within the transcript of a gene) to 73.9 Mb. We computed the Pearson correlation coefficient between ΔFmax and marker distance to the closest gene to assess whether there was any relationship between these two variables.
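A minimal sketch of this last step is given below, assuming that genes are represented by the start and end coordinates of their transcripts on the same chromosome as the marker. The coordinates and ΔFmax values are hypothetical and serve only to show the mechanics.

```python
from math import sqrt


def distance_to_nearest_gene(position, genes):
    """Distance (bp) from a marker to the closest gene; 0 if it lies within a transcript.

    genes -- list of (start, end) transcript coordinates on the marker's chromosome
    """
    best = None
    for start, end in genes:
        if start <= position <= end:
            return 0
        d = start - position if position < start else position - end
        best = d if best is None else min(best, d)
    return best


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical chromosome with two genes and four mapped markers.
genes = [(1000, 2000), (5000, 6000)]
marker_positions = [1500, 2600, 3400, 4900]
distances = [distance_to_nearest_gene(p, genes) for p in marker_positions]

dfmax = [0.45, 0.20, 0.31, 0.12]  # hypothetical ΔFmax values for the four markers
r = pearson_r(distances, dfmax)
```

Under selection acting on genes, one would expect `r` to be negative, since markers far from genes should show smaller ΔFmax.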
Results
We tested whether populations belonging to the same region have more similar allele frequencies than expected by chance due to shared demographic history or shared selective events. Indeed, they show more similar allele frequencies than random populations, as the number of alleles showing ΔF > 0.2 for a given comparison is always significantly larger than expected by chance when tested with the random population permutation procedure (Tables 2–4 and Tables S2‐S4). However, this is not always the case when tested with the geographically explicit permutation test, when randomized regions are made up of spatially neighbouring populations. In the STR dataset, all positive frequency differences between America and the rest of the world that are larger than 0.2 are non‐significant (Table 2 and Table S2). Additionally, some of the larger frequency differences between America and the rest of the world in the indel dataset are also non‐significant (Table 3 and Table S3). The geographically explicit permutation test is expected to be more stringent, as geographically close populations are genetically often more similar than random populations. However, if there are only few populations in a region, as is the case for the Americas, the geographically explicit permutation test is too stringent because the number of different random groups is reduced. Allele‐specific ΔF was therefore tested with the random permutation procedure only, and it was found to be significant in all cases as soon as ΔF > 0.25. We therefore chose an arbitrary threshold for ΔF of 0.3 to define a set of alleles with significant ΔF to summarise the results.
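Positive ΔF values (in the upper part of the table) indicate that the alleles have a lower frequency within African (or American) populations than in the non‐African (or non‐American) populations (because ΔF is computed as the average frequency outside the region minus the average frequency within it).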
| ΔF | Africa: alleles (a) | Africa: significant (b) | Africa: p‐value 1 (c) | Africa: p‐value 2 (d) | Africa: loci (e) | Africa: significant (f) | America: alleles (a) | America: significant (b) | America: p‐value 1 (c) | America: p‐value 2 (d) | America: loci (e) | America: significant (f) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.65–0.7 | 0 |  |  |  |  |  | 0 |  |  |  |  |  |
| 0.6–0.65 | 1 | 1 | ** | ** | 1 | 1 | 0 |  |  |  |  |  |
| 0.55–0.6 | 0 |  |  |  |  |  | 0 |  |  |  |  |  |
| 0.5–0.55 | 5 | 5 | ** | ** | 5 | 5 | 0 |  |  |  |  |  |
| 0.45–0.5 | 9 | 9 | ** | ** | 9 | 9 | 1 | 1 | * |  | 0 | 0 |
| 0.4–0.45 | 9 | 9 | ** | ** | 9 | 9 | 1 | 1 | * |  | 0 | 0 |
| 0.35–0.4 | 19 | 19 | ** | ** | 17 | 17 | 6 | 6 | ** |  | 6 | 6 |
| 0.3–0.35 | 24 | 24 | ** | ** | 22 | 22 | 13 | 13 | * |  | 6 | 6 |
| (−0.3)–0.3 | 9122 | 4604 |  |  | 693 | 609 | 9049 | 3916 |  |  | 625 | 568 |
| (−0.3)–(−0.35) | 9 | 9 | ** | * | 6 | 6 | 53 | 53 | ** | ** | 49 | 49 |
| (−0.35)–(−0.4) | 8 | 8 | ** | * | 7 | 7 | 34 | 34 | ** | ** | 33 | 33 |
| (−0.4)–(−0.45) | 2 | 2 | ** | * | 1 | 1 | 24 | 24 | ** | ** | 24 | 24 |
| (−0.45)–(−0.5) | 1 | 1 | ** | * | 1 | 1 | 15 | 15 | ** | ** | 15 | 15 |
| (−0.5)–(−0.55) | 1 | 1 | ** | * | 1 | 1 | 5 | 5 | ** | ** | 5 | 5 |
| (−0.55)–(−0.6) | 0 |  |  |  |  |  | 7 | 7 | ** | ** | 7 | 7 |
| (−0.6)–(−0.65) | 0 |  |  |  |  |  | 2 | 2 | ** | ** | 2 | 2 |
- a: Total number of alleles with a given ΔF. Note that we have used semi‐open ΔF intervals (]x‐y]) to assign alleles to particular intervals, such that for instance a ΔF value of 0.4 was put in the interval 0.35–0.4.
- b: Number of alleles with a significant ΔF.
- c: p‐value for the number of alleles with a given ΔF using random population permutations (* <= 0.05, ** <= 0.001).
- d: Same as c, but constraining permutations by geography (see Methods; * <= 0.05, ** <= 0.001).
- e: Number of loci with a given ΔFmax value.
- f: Number of loci with a significant allele frequency difference.
| ΔFmax | Africa: loci (a) | Africa: significant (b) | Africa: p‐value 1 (c) | Africa: p‐value 2 (d) | America: loci (a) | America: significant (b) | America: p‐value 1 (c) | America: p‐value 2 (d) |
|---|---|---|---|---|---|---|---|---|
| 0.75–0.8 | 0 |  |  |  | 0 |  |  |  |
| 0.7–0.75 | 1 | 1 | ** | ** | 1 | 1 | ** | * |
| 0.65–0.7 | 2 | 2 | ** | ** | 0 |  |  |  |
| 0.6–0.65 | 1 | 1 | ** | ** | 0 |  |  |  |
| 0.55–0.6 | 4 | 4 | ** | ** | 0 |  |  |  |
| 0.5–0.55 | 3 | 3 | ** | * | 2 | 2 | ** | * |
| 0.45–0.5 | 10 | 10 | ** | ** | 1 | 1 | * |  |
| 0.4–0.45 | 14 | 14 | ** | ** | 6 | 6 | ** | * |
| 0.35–0.4 | 14 | 14 | ** | * | 8 | 8 | ** |  |
| 0.3–0.35 | 12 | 12 | ** | * | 11 | 11 | ** |  |
| 0–0.3 | 149 | 104 |  |  | 181 | 93 |  |  |
- Table header is defined in Table 2.
| ΔFmax | Africa: loci (a) | Africa: significant (b) | Africa: p‐value 1 (c) | Africa: p‐value 2 (d) | America: loci (a) | America: significant (b) | America: p‐value 1 (c) | America: p‐value 2 (d) |
|---|---|---|---|---|---|---|---|---|
| 0.75–0.8 | 3 | 3 | ** | ** | 0 |  |  |  |
| 0.7–0.75 | 10 | 10 | ** | ** | 0 |  |  |  |
| 0.65–0.7 | 1 | 1 | ** | * | 1 | 1 | ** | * |
| 0.6–0.65 | 14 | 14 | ** | ** | 5 | 5 | ** | * |
| 0.55–0.6 | 31 | 31 | ** | ** | 13 | 13 | ** | * |
| 0.5–0.55 | 38 | 38 | ** | ** | 19 | 19 | ** | * |
| 0.45–0.5 | 62 | 62 | ** | ** | 22 | 22 | ** | * |
| 0.4–0.45 | 89 | 89 | ** | ** | 60 | 60 | ** | * |
| 0.35–0.4 | 136 | 136 | ** | ** | 72 | 72 | ** | * |
| 0.3–0.35 | 129 | 129 | ** | * | 143 | 143 | ** | * |
| 0–0.3 | 2321 | 1484 |  |  | 2499 | 1303 |  |  |
- Table header is defined in Table 2.
Overall, we find that large allele frequency differences between geographic regions are extremely frequent (Tables 2–4 and Tables S2–S4). Indeed, 215 of the 772 STR loci (27.9%), 90 out of 210 indel loci (42.9%) and 913 of the 2834 SNP loci (32.2%) have ΔFmax > 0.3 for at least one comparison. Among these, 18.1% of the STR loci with ΔFmax > 0.3 show such a large ΔFmax for more than one comparison, while for the indels and SNPs this fraction is 28.9% and 18.1%, respectively. Note that the total number of loci with ΔFmax > 0.3 is smaller than the sum of the number of loci with ΔFmax > 0.3 involved in the different comparisons that can be computed from Tables S2–S4, because a given locus can show large allele frequency differences in more than one continental comparison. The largest observed ΔF (0.79) was found between African and non‐African populations for the SNP locus ‘rs5972561’ (see below in Figure 4I).

Examples of spatial distribution of alleles with large ΔFs. Black pies represent the frequency of a given allele, and its average frequency within (WR) and out of (OR) the region of interest is shown on the bar plot. Whiskers in the bar plots represent standard deviations. A: allele 298 at the ATA1F08 STR locus (ΔF = 0.45), 18.7 Kb away from closest gene UTRN. B: allele 176 at the GATA84B12 STR locus (ΔF = 0.56), 106.3 Kb away from closest gene CCDC54. C: allele 111 at the GGAA20G10 STR locus (ΔF = 0.51), 628 bp away from closest gene E2F6. D: allele 190 at the GATA11C08 STR locus (ΔF = 0.41), 149.0 Kb away from closest gene STARD13. E: indel locus rs2307832 (ΔF = 0.74), 14.9 Kb away from closest gene USP24. F: indel locus rs133052 (ΔF = 0.72), 9.7 Kb away from closest gene MKL1. G: SNP locus rs6431253 (ΔF = 0.54), 169.2 Kb away from closest gene ARL4C. H: SNP locus rs2252199 (ΔF = 0.53), 30 Kb away from closest gene HSPA13. I: SNP locus rs5972561 (ΔF = 0.79), located in the gene DMD. J: SNP locus rs5959428 (ΔF = 0.52), 323.6 Kb away from closest gene ITM2A.
In the comparisons of Africa and America to the rest of the World, the allele frequency differences are strikingly large (Tables S2‐S4), as expected under the surfing out‐of‐Africa hypothesis. When Africa is contrasted to the rest of the world the fraction of loci with ΔFmax > 0.3 is 10.2%, 29.0%, and 18.1%, for STRs, indels, and SNPs, respectively, and these fractions are 19.0%, 13.8%, and 11.8%, respectively, for the Americas. For the Eurasian and East Asian regions, these numbers are much lower, and vary between 1.2% and 8.6%. In keeping with these results, ΔF's are actually never as large in the comparisons of Eurasia and East Asia as in other comparisons. For instance, STRs do not show any allele with ΔF > 0.45 in Eurasia or in East Asia, whereas ΔF reaches 0.6 in Africa and 0.65 in America.
Given their high mutation rate, it may seem surprising that STR alleles show ΔF as large as those observed for SNPs and for indels if these differences had been created during the expansion out‐of‐Africa some 50 to 60 thousand years ago. Over time, mutations are indeed expected to erode large initial frequency differences at neutral loci, and thus large ΔF (50% or more) could be better explained by their maintenance due to selection. In order to check how quickly mutations would lower the frequency of an allele initially fixed in a population, we have carried out simple simulations at STR loci of an unsubdivided population under a pure stepwise mutation model. We have reported this decrease over 2000 generations in Figure S1 for different mutation rates and different effective population sizes. As expected, the rate of decrease is positively correlated with mutation rate, and its variance is negatively correlated with population size. However, for a mutation rate of 5×10⁻⁴, the allele frequency is still about 65% after 1,000 generations and 46% after 2,000 generations. For a lower mutation rate of 10⁻⁴, the mean expected frequencies are 91–92% and 83–85% after 1,000 and 2,000 generations, respectively, depending on the effective population size. Given the relatively large variance of mutation rates for human STR loci (Xu et al. 2005), it appears therefore likely that STR allele frequencies of more than 80% could still be observed after 2,000 generations if they were initially fixed by surfing or a strong bottleneck, without the need to invoke selection for their maintenance. Still, one would expect that loci with high mutation rates would show lower allele frequency differences today. Since heterozygosity is positively correlated with mutation rate for STRs (Kimmel & Chakraborty, 1996), we would expect loci with a low heterozygosity to have larger allele frequency differences than loci with a high heterozygosity, and this is exactly what we observe in Figure 1.
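The order of magnitude of these figures can be checked without stochastic simulation. The sketch below is not the simulation procedure used for Figure S1. It iterates the expected allele frequency distribution under a symmetric one‐step mutation model, on the assumption that drift leaves the mean frequency of a neutral allele unchanged, so that the mean trajectory depends only on the mutation rate. The function name and the `max_offset` bound are illustrative.

```python
def expected_frequency_of_founder_allele(mu, generations, max_offset=50):
    """Expected frequency of an initially fixed STR allele under a symmetric
    stepwise mutation model, for generations 0..generations.

    mu         -- mutation rate per gene copy per generation
    max_offset -- largest repeat-number change tracked on each side (assumed
                  large enough that the boundary is not reached in practice)
    """
    size = 2 * max_offset + 1
    dist = [0.0] * size      # dist[k]: expected frequency of the allele at offset k - max_offset
    dist[max_offset] = 1.0   # the population starts fixed for a single allele
    trajectory = [1.0]
    for _ in range(generations):
        new = [(1.0 - mu) * p for p in dist]   # gene copies that do not mutate
        for k, p in enumerate(dist):
            step = 0.5 * mu * p                # half of the mutants move each way
            new[max(k - 1, 0)] += step         # lose one repeat (mass kept at the boundary)
            new[min(k + 1, size - 1)] += step  # gain one repeat (mass kept at the boundary)
        dist = new
        trajectory.append(dist[max_offset])
    return trajectory


for mu in (5e-4, 1e-4):
    traj = expected_frequency_of_founder_allele(mu, 2000)
    print(f"mu={mu:g}: {traj[1000]:.3f} after 1,000 generations, "
          f"{traj[2000]:.3f} after 2,000 generations")
```

Under these assumptions the expected frequencies should land close to the values quoted in the text, because back mutations to the original repeat number partly compensate for the loss by mutation. The variance around these means, which does depend on population size, is not captured by this deterministic recursion.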

Relationship between average heterozygosity over all populations (He) and largest absolute allele frequency difference (ΔFmax) for STR loci.
Surfing promotes the increase of allele frequencies in the direction of a spatial expansion. Therefore we expect to find more STR alleles with increased frequency in newly colonised areas than alleles with decreased frequency, since the decrease compensating the increase of a single allele will affect several other alleles at a given locus. This excess should be especially pronounced for Africa and America, because they are separated by spatial bottlenecks from the Eurasian continent. As shown in Figures 2 and 3, there is indeed a clear asymmetry in the distribution of STR allele frequency differences between regions. For instance, by considering only alleles with ΔF > 0.3, there are clearly more alleles that increased in frequency outside Africa than there are alleles that decreased in frequency. On the contrary, for East Asia and the Americas, there are more alleles at a higher frequency within these regions (Table S5). Since it is not possible to describe this pattern for diallelic loci like SNPs and indels, we tested for these markers whether the derived alleles show an asymmetry in frequency differences. We actually did not expect to find any asymmetry, as surfing does not discriminate between ancestral and derived alleles. For the indels the derived allele is about equally likely to increase in frequency as it is to decrease in frequency (Table S6). For SNPs however, we find that derived alleles have more often increased than decreased outside Africa for 0.15 < ΔF < 0.5, while we see the reverse situation in America for 0.3 < ΔF < 0.4 (Table S7). No clear pattern occurs for the other two regions (Table S7). This pattern is compatible with surfing, since most derived SNP alleles have low frequencies in Africa and could thus have had more room to increase in frequency by surfing than already frequent alleles.

Comparison of the distribution of allele frequencies between regions. A: Africa vs. rest of the World; B: Eurasia vs. rest of the World; C: East Asia vs. rest of the World; D: America vs. rest of the World. The grey scale in each square is proportional to the fraction of alleles (on a log‐scale) with a given average frequency. The size of the circles within squares is proportional to the number of loci with a given average frequency. Note that each locus is represented here by the allele with the largest frequency difference. Frequencies below 0 indicate that the alleles are not present in the respective group of populations. Note that alleles on the diagonal have equal frequencies in the two groups of populations.

Lod ratio of the number of alleles with a positive frequency difference (#ΔF+) and the number of alleles with a negative frequency difference (#ΔF‐), where positive means a lower frequency in the region of interest and a higher frequency in the rest of the world, as a function of ΔF. A positive lod ratio indicates that more alleles increased than decreased by a given ΔF out of the region of interest. Filled symbols indicate significant lod ratios (p‐value < 0.05, as assessed by a sign test). We only report ΔF categories with more than 10 alleles.
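To make the asymmetry statistic concrete, the sketch below applies an exact two‐sided sign test and a log ratio to STR allele counts summed from Table 2 over all classes with |ΔF| > 0.3. Pooling the classes is a simplification made here for illustration (the figure treats each ΔF class separately), the base‐10 logarithm is an assumption, and the function names are hypothetical rather than those of the BSDA package.

```python
from math import comb, log10


def sign_test_p(n_plus, n_minus):
    """Exact two-sided sign test of a 50:50 split between n_plus and n_minus."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # Probability of a split at least as uneven as observed in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def log_ratio(n_plus, n_minus):
    """Log10 of the ratio of positive to negative frequency differences."""
    return log10(n_plus / n_minus)


# STR allele counts from Table 2, summed over classes with |ΔF| > 0.3:
# (number with ΔF > 0.3, number with ΔF < -0.3).
africa = (67, 21)
america = (21, 140)

for name, (n_plus, n_minus) in (("Africa", africa), ("America", america)):
    print(name, round(log_ratio(n_plus, n_minus), 2), sign_test_p(n_plus, n_minus))
```

A positive log ratio for Africa and a negative one for America correspond to an excess of alleles that are more frequent outside Africa and inside America, respectively.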
Eberle et al. (2006) found that genic regions are enriched for signals of positive selection compared to non‐genic regions (see also Hinds et al. 2005, Voight et al. 2006, Barreiro et al. 2008). If large ΔF were mainly created by the action of positive selection, it should be especially common close to genes. However, we find that the correlation between ΔFmax and distance to the closest gene is only significant (at the 5% level) in three instances: for STR alleles in Eurasia, as well as for SNP alleles in Eurasia and America (Figures S2 and S4). In all three cases the explained variance (R2) is small and the p‐values are above the 1% level. For indels there is no significant correlation between ΔFmax and distance to genes (Figure S3). However, the power to detect selection close to genic regions may be limited here by a marker density lower than that available in previous genomic studies, although those studies were based on a much smaller number of populations.
Discussion
We have found an unexpectedly large fraction of loci showing strong differences in allele frequencies between continents in all three datasets. 43% of the indels, 32% of the SNPs and 28% of the STR loci show large frequency differences (ΔFmax > 0.3) between a given geographic region and the rest of the world. A visual inspection of the spatial distribution of some of these allele frequencies indeed reveals striking features (Figure 4), with strong differences between continents, either with very narrow or broader clines, which at first sight is difficult to attribute to pure neutral processes. However, the sheer number of loci showing such striking patterns makes it difficult to believe that these patterns have all been shaped by positive selection, as previously advocated (Evans et al. 2005, Mekel‐Bobrov et al. 2005, Akey et al. 2006, Myles et al. 2008).
There is a clear excess of large ΔF between sub‐Saharan Africa or the Americas and other regions as compared to ΔF between Eurasia or East Asia and other regions (Tables S2‐S4). This is in line with previous genome scan studies, which detected more evidence of recent positive selection in Eurasian and East Asian populations as compared to African populations (Kayser et al. 2003, Akey et al. 2004, Storz et al. 2004, Carlson et al. 2005, Williamson et al. 2007). African populations seem therefore to have a deficit of recent positive selection (but see Hawks et al. 2007), which may be interpreted as evidence that selective pressures in recent times were more prevalent outside of Africa (Akey et al. 2004, Storz et al. 2004). In agreement with this hypothesis, Tang et al. (2007) found more genomic regions potentially influenced by selection when Africa was compared to Eurasian or to Asian populations than in the comparison of Eurasia to Asia. Under a selectionist view, this could be explained by the fact that the Eurasian continent has been colonized only recently and traces of selection would be easier to recognize. However, the populations remaining in Africa have also experienced drastic changes in their environment during the past 50,000 years (deMenocal, 2004), and prominent examples of recent genetic adaptations have been found in this continent as well (e.g. beta‐globin (Hanchard et al. 2007), G6PD (Saunders et al. 2002), or lactose tolerance (Tishkoff et al. 2007)). Like Africa, the Americas are also strongly differentiated from the rest of the World, and here selection would have had little time to operate, especially given the overall small sizes of the populations, leading to large levels of differentiation among Amerindian populations (Wang et al. 2007b).
We believe that demographic factors can better explain the particular differentiation of both Africa and the Americas. These two continents are indeed geographically very isolated from the others, such that some spatial and demographic bottlenecks have certainly occurred during the exit out‐of‐Africa to colonize Eurasia and during the colonization of the Americas from North‐East Asia (see e.g. Fagundes et al. 2007). Moreover, these spatial bottlenecks could have also enhanced the possibility of allelic surfing during subsequent spatial expansions (Travis et al. 2007). Allele surfing could also explain the asymmetry of the STR allele frequency distributions (Figures 2 and 3), since this phenomenon originally described the increase in frequency of rare alleles over large and recently colonized areas (Edmonds et al. 2004, Klopfstein et al. 2006). Therefore, the asymmetries shown in Figures 2 and 3 are expected after a range expansion out‐of‐Africa, as well as into Eurasia, East Asia and the Americas.
If large allele frequency differences were mainly driven by positive selection acting on coding regions, one would expect to see a negative relationship between ΔF and the distance between gene and markers. Voight et al. (2006) indeed discovered more signals of selection in genic regions than in non‐genic regions of the genome, and Hinds et al. (2005) and Eberle et al. (2006) found that regions of extended linkage disequilibrium are enriched for genic SNPs. When testing for a correlation between allele frequency differences and distance to genes, however, we find only marginally significant results in three cases. We note, however, that the relatively low number of loci examined here in a large number of populations contrasts with previous genome scan studies, where hundreds of thousands of loci were studied in very few populations. This low marker density may indeed prevent us from obtaining significant results, and it would be interesting to extend our analysis to new databases containing hundreds of thousands of markers (see e.g. Jakobsson et al. 2008, Li et al. 2008). In any case, the fact that markers showing high levels of differentiation between continents appear randomly scattered over the whole genome is more in line with surfing than with positive selection as a cause. It is, however, very likely that we observe the effects of diverse selective and neutral forces and their interaction. Positive selection, genetic drift and allelic surfing mainly lead to increased genetic differences between populations, while balancing selection and migration decrease differentiation. Our results suggest that local adaptation is certainly not the main acting force in promoting these large allele frequency changes between continental regions, but selection could certainly be involved at various loci.
Among the genes that are close to markers with high allele frequency differences between African and non‐African populations, we could identify some that were already signalled as candidates for positive selection in previous studies using criteria other than mere allele frequency differences between continents. These are TCF15 (Storz et al. 2004), KRTAP23–1 (Williamson et al. 2007), PHACTR1 (Williamson et al. 2007), C20orf26 (Williamson et al. 2007), ANTXR2 (Kimura et al. 2007), UTRN (Tang et al. 2007), TYRP1 (Izagirre et al. 2006, Lao et al. 2007), LYST (Izagirre et al. 2006), DMD (Nachman & Crowell, 2000), SEMA4F (Nielsen et al. 2005), and E2F6 (Kayser et al. 2003). This suggests either that markers with geographic differentiation may indeed point to linked selected genes or that previous studies using allele frequency difference as a criterion to identify outlier loci have mistaken surfing for selection.
Since allele surfing looks very much like a selective sweep (Nielsen et al. 2007, Excoffier & Ray, 2008), it would affect other aspects of genetic diversity than the allele frequency spectrum, like linkage disequilibrium and extended homozygosity (Biswas & Akey, 2006). Previous studies aiming at detecting positively selected loci have attempted to control for past demography by either 1) explicitly modelling some complex demography (Sabeti et al. 2007, Stajich & Hahn, 2005, Tang et al. 2007, Williamson et al. 2007), 2) comparing diversity linked to derived or ancestral alleles (Voight et al. 2006), or 3) contrasting coding to non‐coding regions (Akey et al. 2002, Barreiro et al. 2008). To our knowledge, range expansions have never been used as a null model against which observed patterns were examined, and it is thus unclear (and would be worth examining) how the sensitivity of the first type of approach would change under such a new null model. As mentioned above, derived and ancestral alleles show different frequencies in Africa (The International HapMap Consortium, 2007, Li et al. 2008) and the result of positive selection differs between new and standing variation (Przeworski et al. 2005, Teshima et al. 2006, Barrett & Schluter, 2008), so that tests based on the comparison of diversity associated with derived and ancestral alleles may indeed be sensitive to allele surfing, simply because these two allele categories have different initial frequencies. The comparison of genic to non‐genic regions may indeed be the approach most robust against past demography. For instance, Barreiro et al. (2008) compared the proportion of loci with a high FST between genic and non‐genic SNPs. They found that the proportion of genic SNPs with an FST>0.65 was about 2.8 fold larger than the proportion of non‐genic SNPs with equally large FST, and they could identify several candidate genes based on this high level of differentiation between populations.
However, since this class of high FST SNPs represents only about 0.35% of all genic SNPs, it suggests that most genic regions have not been influenced much by selection. We find that positive selection is unlikely to have shaped the allele frequency spectrum at most loci, and it may well have acted on fewer genes than previously believed. However, our current results do not allow us to discriminate between the effects of demography and selection at an individual locus. Loci which are candidates for being under positive selection should therefore be more carefully scrutinized to find links between potentially selected alleles and a phenotypic effect (see e.g. Sabeti et al. 2007).
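The genic versus non‐genic comparison discussed above reduces to two proportions and their ratio. The following sketch shows that arithmetic on toy data; the function name is hypothetical, the default threshold of 0.65 is the one quoted above, and the toy FST values are invented so that they reproduce a 2.8 fold ratio rather than taken from any real data set.

```python
def high_fst_fold_enrichment(genic_fst, nongenic_fst, threshold=0.65):
    """Proportion of SNPs with FST above `threshold` in each class, and
    the genic / non-genic ratio of these proportions."""
    if not genic_fst or not nongenic_fst:
        raise ValueError("both SNP classes must contain at least one value")
    genic_prop = sum(f > threshold for f in genic_fst) / len(genic_fst)
    nongenic_prop = sum(f > threshold for f in nongenic_fst) / len(nongenic_fst)
    if nongenic_prop == 0:
        raise ValueError("no non-genic SNP exceeds the threshold")
    return genic_prop, nongenic_prop, genic_prop / nongenic_prop


if __name__ == "__main__":
    # Toy data: 7 of 1000 genic SNPs and 25 of 10000 non-genic SNPs
    # exceed the threshold.
    genic = [0.70] * 7 + [0.10] * 993
    nongenic = [0.70] * 25 + [0.10] * 9975
    g, n, fold = high_fst_fold_enrichment(genic, nongenic)
    print(f"genic: {g:.4f}, non-genic: {n:.4f}, fold: {fold:.2f}")
```

The same arithmetic applied to the figures quoted above (0.35% of genic SNPs and a 2.8 fold ratio) implies that roughly 0.125% of non‐genic SNPs reach an equally large FST, so both classes of highly differentiated SNPs are rare.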
Conclusions
The survey of the HGDP database on human polymorphisms reveals that large allele frequency differences between continental regions are extremely common. Indeed, as many as 30% of loci show very large allele frequency differences between continents. These differences are unlikely to have been created by positive selection, but are more likely the result of neutral demographic processes such as the surfing phenomenon. Because the erosion of large allele frequency differences by mutation is slow, even for large mutation rates, the surprisingly large number of strongly differentiated STR alleles also does not need to be explained by the action of positive selection. Africa and the Americas show a much larger extent of differentiation than Eurasia or East Asia, which is certainly due to changes in allele frequencies during the colonisation of the Eurasian and the American continents. Disentangling the effects of selection and neutral demographic processes on genome diversity remains an important challenge for future human evolution studies.
Acknowledgements
Thanks to Montgomery Slatkin for his comments on a previous version of the manuscript, and to Gerald Heckel and Matthieu Foll for stimulating discussions on the subject. We are grateful to Mourad Sahbatou and Sijia Wang for providing information about the genomic location of some of the markers, and to Isabelle Dupanloup for providing help on database issues. This work was supported by a Swiss NSF grant No 3100A0‐112072 to L.E.
Web resources
Noah Rosenberg Laboratory: http://rosenberglab.bioinformatics.med.umich.edu/diversity.html
UCSC Genome Browser: http://genome.ucsc.edu/