Keywords:

  • Amerindians;
  • mtDNA;
  • Y-chromosome;
  • X-chromosome;
  • STRs

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Methods
  5. Results
  6. Discussion
  7. Acknowledgements
  8. References
  9. Supporting Information

We report an integrated analysis of nuclear (autosomal, X- and Y-chromosome) short tandem repeat (STR) data and mtDNA D-loop sequences obtained in the same set of 22 Native populations from across the Americas. A north to south gradient of decreasing population diversity was observed, in agreement with a settlement of the Americas from the extreme northwest of the continent. This correlation is stronger with “least cost distances,” which consider the coasts as facilitators of migration. Continent-wide estimates of population structure are highest for the Y-chromosome and lowest for the autosomes, consistent with the effective size of the different marker systems examined. Population differentiation is highest in East South America and lowest in Meso America and the Andean region. Regional analyses suggest a deviation from mutation–drift equilibrium consistent with population expansion in Meso America and the Andes and population contraction in Northwest and East South America. These data hint at an early divergence of Andean and non-Andean South Americans and at a contrasting demographic history for populations from these regions.


Introduction

Contrasting patterns of diversity between different genetic marker systems can provide refined insights into aspects of human population evolution. For instance, such analyses have been used to evaluate possible differences in migration rates between men and women throughout human evolution (reviewed in Wilkins, 2006 and Segurel et al., 2008). In another application, a recent study of autosomal and X chromosome data has led to the proposal of a more severe male than female bottleneck during the initial migration of anatomically modern humans out of Africa (Keinan et al., 2009). In the Americas, continent-wide Native population surveys have been carried out independently with mtDNA (Torroni et al., 1993; Merriwether et al., 1995; Forster et al., 1996; Bonatto & Salzano, 1997; Fagundes et al., 2008), Y-chromosome (Lell et al., 2002; Bortolini et al., 2003), autosomal (Salzano & Callegari-Jacques, 1988; Cavalli-Sforza et al., 1994; Wang et al., 2007) and, more recently, X-chromosome markers (Bourgeois et al., 2009; Wang et al., 2010). However, other than regional studies (Mesa et al., 2000) or continental studies combining data for different population samples (Bortolini et al., 2003) there has not been a systematic, continent-wide, analysis of genetic diversity across marker systems on the same Native American population samples.

We recently performed a genome-wide survey of autosomal diversity in the Americas with data for 678 STRs typed in 22 Native populations from North, Central, and South America (Wang et al., 2007). Here we expand that study with data for 38 X-chromosome STRs, Y-chromosome markers (16 STRs and three biallelic polymorphisms), and mtDNA D-loop sequences obtained in the same population samples. These data emphasize the existence of a north to south gradient in population genetic diversity, consistent with the serial founder model of human population expansion (Ramachandran et al., 2005) and with a colonization of the Americas from the extreme northwest of the continent, with the coasts acting as facilitators during this process. Contrasting diversity across marker systems points to a differing demographic history across South America, with Andean populations diverging early in the settlement of the region and having larger long-term effective sizes than non-Andean populations.

Methods

Samples

A total of 396 individuals from 22 Native American populations were examined (Fig. 1). These were the same individuals and populations examined in Wang et al. (2007), including 114 males and 282 females. For several analyses these populations were divided into five groups based mainly on geographic/linguistic grounds: North America (Cree and Ojibwa), Meso America (Mixtec, Mixe, Zapotec, and Kaqchikel), Northwest South America (including Lower Central America: Embera, Guaymi, Arhuaco, Zenu, Wayuu, Waunana, Kogi, and Cabecar), the Andes (Inga, Quechua, Huilliche, and Aymara), and East South America (Aché, Guaraní, Ticuna, and Kaingang).

Figure 1. Approximate location of the sampling site for the populations examined here. Color codes indicate, for each population, its affiliation to a major Native American linguistic stock, following the classification of Ruhlen (1991).

mtDNA Sequencing

DNA amplification and sequencing were based on the methods described in Brandstatter et al. (2004). The entire mtDNA control region (D-loop) was amplified with primers F15878 (5′-AAATGGGCCTGTCCTTGTAG-3′) and R599 (5′-TTG AGG AGG TAA GCT ACA TA-3′). Each 20 μl PCR reaction consisted of 2 μl Bioline 10× reaction buffer, 0.8 μl 50 mM MgCl2, 0.4 μl 10 μM dNTP mix, 0.4 μl 10 μM F/R primers, and 0.4 μl BiolineTaq (Bioline Ltd., London, UK). Thermal cycler conditions were: 95°C for 10 min and then 36 cycles of 94°C for 1 min, 56°C for 1 min, and 72°C for 2 min. PCR products were purified and sequenced with BigDye™ Terminator v3.1 cycle sequencing kit (Applied BioSystems, Foster City, CA). Each sequencing reaction, made up to a total volume of 11 μl, contained 2.07 μl of 5× sequencing buffer, 0.25 μl of BigDye™ Terminator v3.1 ready reaction mix, 0.18 μl of the sequencing primer, and 2 μl of purified PCR product. Sequencing was carried out with 25 cycles of 96°C for 10 sec, 55°C for 5 sec, and 60°C for 4 min. Both forward and reverse strands were sequenced, providing at least two independent readings for each nucleotide position. Primers used for sequencing included the two initial PCR primers as well as primers F16190 (5′-CCC CAT GCT TAC AAG CAA GT-3′), F15 (5′-CAC CCT ATT AAC CAC TCA CG-3′), R274 (5′-TGT GTG GAA AGT GGC TGT GC-3′), R16400 (5′-GTC AAG GGA CCC CTA TCT GA-3′), and R16175 (5′-TGG ATT GGG TTT TTA TGT A-3′). After raw data editing, the nucleotide sequences were analyzed from position 16021 to 16569 and from position 1 to 499 of the mtDNA control region (D-loop). All sequence coordinates used followed the revised Cambridge Reference Sequence (rCRS) (Anderson et al., 1981; Chinnery et al., 1999).
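
The two analysed coordinate ranges form one stretch that wraps around the origin of the circular mtDNA molecule. The short sketch below computes its length, assuming the standard rCRS length of 16,569 bp; the function name is illustrative and not part of any published pipeline.

```python
RCRS_LENGTH = 16569  # assumed length of the rCRS in base pairs


def circular_span(start, end, genome_length=RCRS_LENGTH):
    """Count positions from start to end (inclusive, 1-based) on a circular genome."""
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("positions must lie within the genome")
    if end >= start:
        return end - start + 1
    # The span runs past the last position and continues from position 1.
    return (genome_length - start + 1) + end


# Positions 16021-16569 followed by positions 1-499, as described in the text.
analysed_bp = circular_span(16021, 499)
print(analysed_bp)
```

Treating the region as a single wrapped span avoids handling the two coordinate blocks separately when checking sequence coverage.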

X-Chromosome STRs

The following 38 X chromosome STRs were genotyped by the Marshfield Mammalian Genotyping Service (http://research.marshfieldclinic.org/genetics/home/index.asp): GATA52B03, GATA124B04, AGAT144, AFMA184WF1P, GATA175D03, TATC052, ATA28C05, GATA124E07, GATA186D06, ATCT057M, GATA027M, AFMA124XD9, GATA69C12, GATA144D04, AFM276XF5P, GATA72E05M, AGAT104M, GATA31D10M, GATA31F01P, TAGA017, ATA31E12, GATA10C11, GATA172D05, GATA48H04, AFM248WE5, GATA165B12P, GATA198A10P, ATCT003, AFMA046WB9, AAAT112P, GATA31E08, TATC043, 224ZG11, AFMA121ZE5, TTTA062, GGAT3F08, GATA189B04P, and ATA71D03M.

Y-Chromosome Markers

For male individuals, three Y-SNPs (M3, M19, and M242) were genotyped as described previously (Karafet et al., 1999; Ruiz-Linares et al., 1999; Seielstad et al., 2003). In addition, data for 18 Y-STRs were also obtained. The following seven markers were genotyped at the Marshfield Mammalian Genotyping Service: DYS19, DYS388, DYS389 (GATA30F10L, GATA30F10LA), DYS390, DYS391, and DYS395. An additional 11 markers were genotyped in two sets by multiplex PCR. The first set comprised markers DYS447, DYS448, DYS450, DYS456, and DYS458, and the second set comprised markers DYS437, DYS438, DYS439, DYS426, DYS460 (A7.1), and H4. PCRs were performed using the Hotstart Taq system (Promega, London, UK), at total volumes of 20 μl (for the 5-plex panel) or 20.6 μl (for the 6-plex panel). For the 5-plex panel, each PCR reaction consisted of 10 μl Hotstart Taq Master Mix, 8 μl Primer mix (final concentration for each primer was 0.4 μM), 1 μl water, and 1 μl template DNA. For the 6-plex panel, each PCR reaction consisted of 10 μl Hotstart Taq Master Mix, 9.6 μl Primer mix, and 1 μl genomic DNA. After denaturing for 10 min at 95°C, amplification was carried out with 28 cycles of: 94°C for 1 min, 55°C for 1 min, then 72°C for 1 min. The sequence of the PCR primers used for each marker is provided in Table S1.

Data Analysis

Population diversity, structure and demography

The average unbiased gene diversity across loci for each population and each geographic region was computed using ARLEQUIN v3.1 (Excoffier et al., 2005). The correlation between the ranks of genetic diversities calculated from the different marker systems was evaluated with Kendall's coefficient of concordance (Siegel, 1956). Continental and regional FST estimates were obtained using POWERMARKER (Liu & Muse, 2005) with standard errors calculated by bootstrap and significance of differences between estimates evaluated using a t-test. STRUCTURE 2.2 was used to evaluate clustering of individuals with the Bayesian approach of Pritchard et al. (2000) and Falush et al. (2003) and employing the same analysis parameters as in Wang et al. (2007). Results were displayed using DISTRUCT 1.1 (Rosenberg, 2004). Based on the X-STR data, a principal component analysis (PCA) was performed, using the SPSS package, on the pairwise population distance matrix, calculated with POWERMARKER, using the DA distance (Nei & Chesser, 1983). Fu's Fs test was applied to the mtDNA D-loop sequence data using ARLEQUIN v3.1 (Excoffier et al., 2005). The program BOTTLENECK (Cornuet & Luikart, 1996) was run on the X-STR data with 1000 iterations per locus.
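
As an illustration of the diversity statistic named above, the following minimal sketch computes Nei's unbiased gene diversity per locus, n/(n − 1) × (1 − Σp²), and averages it across loci. It is not a reproduction of ARLEQUIN's implementation; the function names and the allele data are hypothetical.

```python
from collections import Counter


def unbiased_gene_diversity(alleles):
    """Nei's unbiased gene diversity for one locus, from a list of sampled allele copies."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_gene_diversity(loci):
    """Average the per-locus estimates across loci."""
    values = [unbiased_gene_diversity(a) for a in loci]
    return sum(values) / len(values)


# Hypothetical repeat counts at two STR loci in one population sample.
example_loci = [
    [10, 10, 11, 12, 12, 12],
    [7, 7, 7, 7, 8, 8],
]
print(round(mean_gene_diversity(example_loci), 3))
```

The n/(n − 1) factor corrects the downward bias of the plug-in estimate in small samples, which matters for the sample sizes listed in Table 1.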

Geographic computations

Geographic coordinates for the populations were taken from Wang et al. (2007). Initially, distances between populations were computed using simple great arc routes, with obligatory waypoints as specified in Wang et al. (2007). In addition, least-cost distances between the Bering Strait and population locations were computed using PATHMATRIX (Ray, 2005) following the approach described in detail in Wang et al. (2007). Briefly, these distances are based on least-cost paths computed on the basis of a spatial cost map incorporating landscape components. For instance, at a coastal/inland relative cost of 1:1 (i.e., a “uniform” cost) the only spatial constraints are the boundaries of the continental landmasses. Since we wanted to evaluate the role of coastlines as corridors of migration, we also compared the following coastal/inland relative costs: 1:2, 1:5, 1:10, 1:20, 1:30, 1:40, 1:50, 1:100, 1:200, 1:300, 1:400, and 1:500. Computations were performed on a Lambert azimuthal equal area projection of the American landmass (central meridian 80°W, reference latitude 10°N) divided into a grid of 100 km² square cells. Pearson's correlation coefficient (r) was estimated between the levels of gene diversity (mean expected heterozygosity (He)) and the great arc (or least cost) population distances from the Bering Strait.
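
The idea of a least-cost distance under a coastal/inland relative cost can be sketched with Dijkstra's algorithm on a small grid. This is a simplified illustration, not PATHMATRIX's actual procedure: the toy map, the 4-connected moves, and the rule that a step costs the value of the cell being entered are all assumptions made for the example.

```python
import heapq


def least_cost_distance(grid, start, goal, inland_cost=10.0, coastal_cost=1.0):
    """Dijkstra over a grid of 'C' (coast), 'I' (inland) and 'S' (sea, impassable) cells.

    Moves are 4-connected and each step costs the value of the cell being entered.
    """
    cost_of = {"C": coastal_cost, "I": inland_cost}
    rows, cols = len(grid), len(grid[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return dist
        if dist > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] in cost_of:
                new_dist = dist + cost_of[grid[nr][nc]]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    return float("inf")


# Hypothetical landmass: an inland block bordered by coast on three sides.
toy_map = [
    "CIIIC",
    "CIIIC",
    "CCCCC",
]
uniform = least_cost_distance(toy_map, (0, 0), (0, 4), inland_cost=1.0)
coastal = least_cost_distance(toy_map, (0, 0), (0, 4), inland_cost=10.0)
print(uniform, coastal)
```

With a uniform cost the shortest route crosses the interior, whereas at a 1:10 coastal/inland ratio the cheapest route follows the longer coastal detour, which is the behaviour the relative-cost comparison is designed to probe.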

Results

Within-Population Diversity

Table 1 shows the genetic diversity for the different marker systems tested in the set of Native American populations examined here and by Wang et al. (2007). A positive correlation is observed for all pairwise comparisons across marker systems, with the strongest correlation seen between autosomal and X-chromosome STRs (Kendall's tau = 0.77) and the weakest between mtDNA and Y-chromosome STRs (Kendall's tau = 0.01). A negative correlation (r = −0.43) of population genetic diversity with (great arc) distance from the Bering Strait has been previously reported for the autosomal STRs (Wang et al., 2007). Diversity at the additional marker systems examined here shows evidence of the same trend, with similarly negative correlations with distance from the Bering Strait observed for X-STRs and mtDNA (r(X) = −0.413; r(mtDNA) = −0.421). A weaker negative correlation is observed for Y-STRs (r = −0.128).
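
To make the rank-correlation comparison concrete, here is a minimal sketch of Kendall's tau (the tau-b variant, which allows ties) applied to the autosomal and X-chromosome diversities of six populations taken from Table 1. The choice of tau-b and the six-population subset are assumptions for illustration, so the value is not expected to match the published continent-wide estimate.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + tied_x_only)
                      * (concordant + discordant + tied_y_only))
    if denom == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denom


# HA and HX for Aché, Arhuaco, Aymara, Cabecar, Cree and Kaqchikel (Table 1).
h_autosomal = [0.482, 0.619, 0.661, 0.622, 0.695, 0.662]
h_x = [0.442, 0.502, 0.634, 0.609, 0.670, 0.686]
tau = kendall_tau_b(h_autosomal, h_x)
print(round(tau, 3))
```

In this subset only one pair of populations (Kaqchikel and Cree) is ranked differently by the two marker systems, so the statistic is close to 1.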

Table 1. Native American population diversity estimated with nuclear STRs and mtDNA D-loop sequences

            mtDNA D-loop           678 A-STRs          38 X-STRs          18 Y-STRs
Population  N   Nucl div  SE       2N   HA     SE      N   HX     SE      N   HY     SE
Aché        11  0.0015    0.0011   38   0.482  0.205   14  0.442  0.206   9   0.068  0.232
Arhuaco     16  0.0035    0.0021   34   0.619  0.145   13  0.502  0.168   6   0.333  0.000
Aymara      17  0.0090    0.0048   36   0.661  0.132   12  0.634  0.128   9   0.438  0.266
Cabecar     15  0.0093    0.0050   40   0.622  0.146   14  0.609  0.155   9   0.332  0.244
Cree        11  0.0104    0.0058   36   0.695  0.115   10  0.670  0.109   2   N/A    N/A
Embera      9   0.0053    0.0032   22   0.616  0.155   16  0.576  0.171   4   0.667  0.000
Guaraní     8   0.0056    0.0034   20   0.644  0.144   18  0.606  0.178   1   N/A    N/A
Guaymi      13  0.0082    0.0045   36   0.583  0.174   13  0.523  0.196   9   0.395  0.306
Huilliche   20  0.0093    0.0050   40   0.667  0.119   15  0.638  0.134   7   0.257  0.247
Inga        16  0.0092    0.0050   34   0.640  0.140   12  0.595  0.154   3   N/A    N/A
Kaingang    2   N/A       N/A      14   0.623  0.178   6   0.543  0.180   1   N/A    N/A
Kaqchikel   16  0.0087    0.0047   24   0.662  0.138   14  0.686  0.120   4   0.287  0.298
Kogi        16  0.0081    0.0044   34   0.560  0.175   12  0.517  0.179   0   N/A    N/A
Mixe        20  0.0101    0.0054   40   0.642  0.136   16  0.631  0.117   7   0.105  0.180
Mixtec      17  0.0096    0.0051   40   0.646  0.141   15  0.639  0.152   6   0.272  0.282
Ojibwa      16  0.0095    0.0051   40   0.689  0.115   17  0.645  0.152   4   0.407  0.262
Quechua     18  0.0098    0.0052   40   0.671  0.123   18  0.646  0.132   1   N/A    N/A
Ticuna      12  0.0092    0.0051   70   0.603  0.156   27  0.554  0.161   16  0.293  0.256
Waunana     18  0.0080    0.0043   40   0.610  0.156   14  0.559  0.148   4   0.213  0.277
Wayuu       18  0.0114    0.0060   34   0.670  0.125   7   0.653  0.158   7   0.560  0.288
Zapotec     19  0.0114    0.0060   38   0.668  0.138   17  0.658  0.105   1   N/A    N/A
Zenu        15  0.0054    0.0031   36   0.639  0.142   13  0.585  0.152   4   0.343  0.291

N and 2N: sample size. N/A: not estimated due to sample size.

Consistent with the analyses of autosomal STRs (Wang et al., 2007), the geographic distance-genetic diversity correlations increase for mtDNA and the X-chromosome when using least-cost paths that consider the coasts as facilitators of migration (Fig. 2). Interestingly, the highest correlation is seen for a coastal/inland cost ratio similar to the one observed previously with the autosomal STR dataset (∼1:10) (Wang et al., 2007).
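
The comparison across cost ratios amounts to computing the squared Pearson correlation between diversity and distance for each distance scheme and locating the maximum. The sketch below illustrates this with invented numbers; the diversities, distances, and ratio labels are hypothetical and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical gene diversities for four populations.
diversity = [0.71, 0.68, 0.65, 0.62]
# Hypothetical distances (km) from the Bering Strait under two cost schemes.
distances_by_ratio = {
    "1:1": [1000, 5000, 6000, 12000],
    "1:10": [1000, 4000, 7000, 10000],
}
r_squared = {ratio: pearson_r(d, diversity) ** 2
             for ratio, d in distances_by_ratio.items()}
best_ratio = max(r_squared, key=r_squared.get)
print(best_ratio, {k: round(v, 3) for k, v in r_squared.items()})
```

Because r itself is negative for a north to south decline in diversity, comparing R² rather than r keeps the "strongest correlation" criterion independent of sign.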

Figure 2. R² (square of the correlation coefficient) between population diversity (estimated as gene diversity for X-STRs, top, and as nucleotide diversity for mtDNA, bottom) and distance from the Bering Strait. Correlations were evaluated for great arc distances as well as for least-cost distances at a range of coastal/inland relative costs (1:1 to 1:500). Correlations significant at the 0.05 level are indicated by filled symbols.

Table 2 contrasts the mean genetic diversity across markers in different parts of the Americas. As expected, based on differences in ploidy levels, diversity is highest for the autosomes, intermediate for the X-chromosome, and lowest for the Y-chromosome. When comparing the different geographic regions, Eastern South America consistently shows the lowest genetic diversity across all marker systems.

Table 2. Genetic diversity across regions in the Americas estimated with different marker systems

                  A-STRs              X-STRs              Y-STRs              mtDNA
Region            N    HA     SE      N    HX     SE      N    HY     SE      N    Nucl div  SE
North Am.         76   0.699  0.105   27   0.657  0.127   6    0.465  0.275   27   0.0100    0.0052
Meso Am.          142  0.665  0.123   62   0.663  0.110   18   0.367  0.250   72   0.0108    0.0055
Northwest S. Am.  276  0.661  0.121   94   0.626  0.131   43   0.482  0.234   120  0.0102    0.0052
Andes             150  0.672  0.114   57   0.645  0.117   20   0.445  0.218   71   0.0099    0.0050
East S. Am.       142  0.633  0.135   56   0.603  0.144   27   0.335  0.273   33   0.0093    0.0048

N = number of chromosomes.
North America: Cree, Ojibwa.
Meso America: Mixtec, Mixe, Zapotec, Kaqchikel.
Northwest South America (including Lower Central America): Embera, Guaymi, Arhuaco, Zenu, Wayuu, Waunana, Kogi, Cabecar.
Andes: Inga, Quechua, Huilliche, and Aymara.
East South America: Aché, Guaraní, Ticuna, Kaingang.

Population Structure

Continent-wide FST estimates considering the set of 15 populations for which data were available for at least four Y-chromosomes are: F(A)ST= 0.068, F(X)ST= 0.094, F(mtDNA)ST= 0.256, and F(Y)ST= 0.390 (all pairwise comparisons are significantly different at p < 0.05). The corresponding regional FST values are shown in Table 3. In all regions FST is highest for the Y-chromosome followed by mtDNA. In Northwest and East South America F(X)ST is higher than F(A)ST, while in the Andes and Meso America the opposite is observed. Comparing regions, the highest population differentiation for all types of markers is seen in Eastern South America. Northwest South America also shows an increased differentiation, relative to Meso America and the Andes (except for the Y-chromosome, which shows a higher differentiation in Meso America).
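
For readers unfamiliar with how differentiation statistics of this kind are built, the sketch below computes a simple single-locus FST analogue, (HT − HS)/HT, from subpopulation allele frequencies. It is a plain Nei-style estimate with equal population weights and no sample-size correction, so it is not the estimator implemented in POWERMARKER; the function names and frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p^2) from a dict of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def simple_fst(populations):
    """(HT - HS) / HT for one locus, weighting all subpopulations equally.

    populations: list of dicts mapping allele -> frequency (each summing to 1).
    """
    if len(populations) < 2:
        raise ValueError("need at least two subpopulations")
    alleles = set().union(*populations)
    k = len(populations)
    pooled = {a: sum(p.get(a, 0.0) for p in populations) / k for a in alleles}
    h_total = heterozygosity(pooled)
    h_within = sum(heterozygosity(p) for p in populations) / k
    if h_total == 0:
        raise ValueError("locus is monomorphic across all subpopulations")
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies at one STR locus in two populations.
similar = [{"10": 0.5, "11": 0.5}, {"10": 0.6, "11": 0.4}]
divergent = [{"10": 0.9, "11": 0.1}, {"10": 0.1, "11": 0.9}]
print(round(simple_fst(similar), 3), round(simple_fst(divergent), 3))
```

The statistic rises as within-population heterozygosity falls relative to the pooled heterozygosity, which is why strong drift within small populations inflates FST.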

Table 3. FST within regions in the Americas estimated with different marker systems

Region            A-STRs  X-STRs  Y-STRs  mtDNA
Meso Am.          0.025   0.017   0.506   0.144
Northwest S. Am.  0.069   0.117   0.402   0.300
Andes             0.021   0.019   0.235   0.160
East S. Am.       0.149   0.202   0.585   0.548

Calculations were done across all marker sets using the same set of 15 populations for which data for at least 4 Y-chromosomes were available. All estimates across marker systems within a geographic region are significantly different (p < 0.05).

The X-chromosome STR data were used to examine population structure with the model-based approach implemented in the program STRUCTURE (Pritchard et al., 2000; Falush et al., 2003). In a continent-wide analysis (Fig. 3), the model with K = 4 identifies two Eastern South American populations (Ticuna and Aché) and the Northwest South American populations as separate clusters. Increasing K to 5 identifies an additional Northwest South American component (predominant in the Kogi and Arhuaco). At K = 6, a North American component is apparent, which is also slightly predominant in Meso Americans relative to other populations. In addition, the component that predominates in Andeans is also seen at a relatively high frequency in two Northwestern populations (Wayuu and Zenu) and in two of the Eastern South American populations (Guaraní and Kaingang). Previous STRUCTURE analysis of autosomal STRs produced similar patterns (Wang et al., 2007), with the least variable groups forming distinctive clusters (particularly for East and Northwest South America). Autosomal STRs show somewhat less evidence of a differentiation between Meso Americans and Andeans than seen with X-STRs. An analysis restricted to the South American populations shows similar patterns (Fig. 4). At K = 3, predominant components are identified for the Eastern Aché and the Northwestern Kogi and Arhuaco. Increasing K to 4 identifies the Eastern Ticuna as a separate cluster, while at K = 5 an Andean component is defined, which is also predominant in the Northwestern Wayuu and Zenu, and in the Eastern Guaraní and Kaingang.

Figure 3. Unsupervised analysis of population structure in America based on data for 38 X-STRs typed in 22 Native American populations, obtained using the STRUCTURE program. The number of clusters in a given plot is indicated by the value of K on the left (plots are shown only for K= 4–6). The geographic regions considered in other analyses and the name of each individual population, are indicated at the top and bottom of the figure, respectively. The left-to-right order of the individuals is the same in all plots.

Figure 4. Unsupervised analysis of South American population structure based on 38 X-STRs, obtained using the STRUCTURE program. The number of clusters in a given plot is indicated by the value of K on the left. The geographic regions considered in other analyses and the name of each individual population, are indicated at the top and bottom of the figure, respectively. The left-to-right order of the individuals is the same in all plots.

Population Relatedness

Principal component analysis (PCA) was used to examine the relatedness of populations based on the X-chromosome STR data. Figure 5A shows results excluding two East South American populations (the Aché and Kaingang) that represent extreme outliers (Figure S1 shows results for the full set of 22 populations). Figure 5B shows results after excluding four outliers seen in Figure 5A (the Northwest South American Arhuaco, Kogi, Embera, and Guaymi). These PCA plots are consistent with the population diversity and structure analyses presented above in showing the greatest differentiation among East and Northwest South American populations. Populations from these regions cluster at opposite ends relative to the two North American populations (Cree and Ojibwa), with the Andean and Meso American populations occupying intermediate positions. Figure 5B shows a closer relatedness of Eastern and Northwest South American populations to the Andeans than to Meso Americans. Similar broad patterns of relatedness were observed in the phylogenetic tree reported by Wang et al. (2007) based on the autosomal STR data collected in these same population samples.
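
The PCA operates on a pairwise population distance matrix. As a hedged illustration of how such a matrix can be assembled, the sketch below uses the usual form of the DA distance, 1 − (1/r) Σ over loci of Σ over alleles of √(x·y), where r is the number of loci. It does not reproduce POWERMARKER's code, and the population names and frequencies are hypothetical.

```python
import math


def da_distance(pop_x, pop_y):
    """DA distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share the same non-empty set of loci")
    total = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        shared = set(freq_x) & set(freq_y)  # alleles absent from either side add 0
        total += sum(math.sqrt(freq_x[a] * freq_y[a]) for a in shared)
    return 1.0 - total / len(pop_x)


def distance_matrix(populations):
    """Symmetric matrix of pairwise DA distances, in the order of the input dict."""
    names = list(populations)
    return [[da_distance(populations[a], populations[b]) for b in names] for a in names]


# Hypothetical allele frequencies at two loci for three populations.
pops = {
    "P1": [{"10": 0.5, "11": 0.5}, {"7": 1.0}],
    "P2": [{"10": 0.5, "11": 0.5}, {"7": 1.0}],
    "P3": [{"12": 1.0}, {"8": 1.0}],
}
matrix = distance_matrix(pops)
print(matrix)
```

The distance is 0 for identical frequency profiles and 1 when no alleles are shared at any locus, so outlying populations such as those excluded from Figure 5 would appear as rows with uniformly large values.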

Figure 5. First three components obtained by PCA of a pairwise genetic distance matrix obtained from allele frequency data for 38 X-chromosome STRs for 20 (A) and 16 (B) of the Native American populations examined here (results for the full 22 populations are shown in Figure S1). Population symbols and coloring are as in Figure 1.

Population Demography

We used three independent approaches to evaluate changes in population size with the different marker data obtained here. First, we applied tests based on the allele frequency distribution of X-STRs, as implemented in the program BOTTLENECK (Cornuet & Luikart, 1996). This program implements a test for departure from the expected relationship, at mutation–drift equilibrium, between the observed heterozygosity and the number of alleles. It has been shown that population contraction and expansion lead, respectively, to a transient excess or deficiency of heterozygosity, relative to that expected based on the observed number of alleles (Nei & Chakraborty, 1975; Maruyama & Fuerst, 1984). This is due to allelic diversity varying more rapidly than heterozygosity with changes in population size, with rare alleles disappearing after a contraction or becoming more common after an expansion. BOTTLENECK results are summarized in Table 4 and illustrative X-STR allele frequency distributions are shown in Figure 6 (graphs for all populations are shown in Figure S2). Several populations, mainly those from Eastern and Northwest South America (e.g. the Aché), show few low-frequency alleles, consistent with a recent population contraction. Populations from Meso America and the Andes mostly show an opposite trend, with an apparent excess of rare alleles, consistent with population expansion. Overall, seven of the 18 populations tested deviate from mutation–drift equilibrium based on the “standardized difference” test implemented in BOTTLENECK (Table 4; Cornuet & Luikart, 1996). Second, we applied Fu's Fs test to the mtDNA D-loop sequences obtained here (Table 4). This test evaluates whether the number of different sequences found exceeds that expected for a population at equilibrium, based on the observed nucleotide diversity, and is characterized by negative values in expanding populations (Fu, 1997; Excoffier & Schneider, 1999).
Among the Native American populations examined, negative Fs values were observed in all the Andean populations and in some of the Meso and North Americans. The Northwestern South and Eastern South American populations all show positive Fs values (Table 4). Third, we examined the X to autosomal STR diversity ratio (HX/HA). It has been shown that, due to differences in ploidy, this ratio is sensitive to changes in population size (Pool & Nielsen, 2007). Values of the HX/HA ratio in the Native American populations examined are shown in Table 4. The mean HX/HA values for each region are: 0.907 (Northwestern South America), 0.918 (East South America), 0.950 (North America), 0.952 (Andean South America), and 0.998 (Meso America). There is a highly significant negative correlation between the Fs values obtained from the mtDNA D-loop sequences and the HX/HA ratios (Spearman's r = −0.592; p = 0.005).
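
The regional means quoted above can be recomputed from the per-population HX/HA values listed in Table 4. The short sketch below does this; the region labels are normalized spellings of those in the table and the function name is illustrative.

```python
from collections import defaultdict

# (population, region, HX/HA) as listed in Table 4.
TABLE4_RATIOS = [
    ("Aymara", "Andes", 0.960), ("Huilliche", "Andes", 0.956),
    ("Inga", "Andes", 0.929), ("Quechua", "Andes", 0.963),
    ("Aché", "East S. Am.", 0.917), ("Ticuna", "East S. Am.", 0.918),
    ("Kaqchikel", "Meso Am.", 1.036), ("Mixe", "Meso Am.", 0.983),
    ("Mixtec", "Meso Am.", 0.989), ("Zapotec", "Meso Am.", 0.985),
    ("Cree", "North Am.", 0.963), ("Ojibwa", "North Am.", 0.936),
    ("Arhuaco", "Northwest S. Am.", 0.810), ("Cabecar", "Northwest S. Am.", 0.979),
    ("Guaymi", "Northwest S. Am.", 0.896), ("Kogi", "Northwest S. Am.", 0.924),
    ("Waunana", "Northwest S. Am.", 0.915), ("Zenu", "Northwest S. Am.", 0.916),
]


def regional_means(rows):
    """Average the HX/HA ratio over the populations of each region."""
    grouped = defaultdict(list)
    for _population, region, ratio in rows:
        grouped[region].append(ratio)
    return {region: sum(values) / len(values) for region, values in grouped.items()}


means = regional_means(TABLE4_RATIOS)
for region, value in sorted(means.items(), key=lambda item: item[1]):
    print(f"{region}: {value:.3f}")
```

Sorting by the mean puts Northwest South America at the low end and Meso America at the high end, in line with the contraction and expansion signals described in the text.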

Table 4. Evaluation of deviation from mutation–drift equilibrium with different marker systems in the Native American populations examined

Population  Region            X-STRs  mtDNA   HX/HA
Aymara      Andes             0.180   −5.77   0.960
Huilliche   Andes             0.002   −10.44  0.956
Inga        Andes             0.035   3.73    0.929
Quechua     Andes             0.005   −8.29   0.963
Aché        East S. Am.       0.044   5.01    0.917
Ticuna      East S. Am.       0.006   2.54    0.918
Kaqchikel   Meso Am.          0.499   −4.83   1.036
Mixe        Meso Am.          0.381   1.95    0.983
Mixtec      Meso Am.          0.440   −1.2    0.989
Zapotec     Meso Am.          0.005   −1.74   0.985
Cree        North Am.         0.370   −0.1    0.963
Ojibwa      North Am.         0.418   2.98    0.936
Arhuaco     Northwest S. Am.  0.000   4.62    0.810
Cabecar     Northwest S. Am.  0.283   0.34    0.979
Guaymi      Northwest S. Am.  0.181   1.95    0.896
Kogi        Northwest S. Am.  0.280   3.51    0.924
Waunana     Northwest S. Am.  0.377   2.3     0.915
Zenu        Northwest S. Am.  0.182   4.98    0.916

Only populations with data for > 10 samples are included. Results for the standardized difference test (Cornuet & Luikart, 1996) and Fu's Fs are shown for X-STRs and mtDNA, respectively. Values shown in bold are significant at the 5% level.

Figure 6. Histograms showing the proportion of alleles within frequency bins for 38 X-STRs in four Native American populations. Plots for the full set of 22 populations examined are displayed in Figure S2.
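
A histogram of this kind can be tabulated by pooling the alleles of all loci and counting how many fall into each frequency class. The sketch below assumes ten equal-width classes and left-closed bins; both choices, as well as the function name and the example frequencies, are illustrative assumptions rather than the settings used for the figure.

```python
def allele_frequency_histogram(loci, n_bins=10):
    """Proportion of alleles falling into equal-width frequency classes.

    loci: list of dicts mapping allele -> frequency within one population.
    Bins are left-closed; a frequency of exactly 1.0 goes into the last bin.
    """
    counts = [0] * n_bins
    for freqs in loci:
        for p in freqs.values():
            if p <= 0:
                continue  # alleles not observed in this population are ignored
            index = min(int(p * n_bins), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no observed alleles")
    return [c / total for c in counts]


# Hypothetical single-locus example: two rare alleles, one uncommon, one common.
example = [{"9": 0.05, "10": 0.05, "11": 0.15, "12": 0.75}]
histogram = allele_frequency_histogram(example)
print(histogram)
```

A distribution dominated by the lowest frequency class is the pattern expected without a recent contraction, whereas a depleted lowest class corresponds to the loss of rare alleles described for populations such as the Aché.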

Discussion

The mtDNA, X-chromosome, and Y-chromosome data obtained here are globally consistent with the previously reported autosomal STR data in showing a north to south gradient in genetic diversity (and an increase in population differentiation) in the Americas. This trend extends the pattern of decreasing population diversity seen worldwide as a function of distance from Africa (Ramachandran et al., 2005; Handley et al., 2007; Wang et al., 2007; Jakobsson et al., 2008; Li et al., 2008). In the Americas, this gradient is consistent with an initial settlement from Northeast Asia into the extreme Northwest of the continent, followed by a southward colonization (Wang et al., 2007). Interestingly, as reported based on autosomal STRs (Wang et al., 2007), the mtDNA and X-STR data obtained here (in the same population samples) are also consistent with the coasts playing an important role during the initial human population dispersal in the American continent. For these various sets of markers, the correlation of population diversity with distance from the Northwest increases when considering “least cost” distances incorporating the coastal outline, coinciding in a maximum correlation around a 1:10 coast/inland cost ratio. The role of the coasts in the initial human population dispersal outside of Africa has become the subject of increased scrutiny in the last few years, with a growing body of evidence pointing to the coasts as facilitators of migration (Liu et al., 2006). In the Americas, although classical models of settlement posited an inland migration (Fiedel, 2000; Dixon, 2001) there is growing interest in the possibility of a coastal migration route (Dixon, 2001; Fix, 2002, 2005). Recent geological research indicates the existence of ice-free coastal areas west of the Cordilleran ice sheet ∼14,000 years ago, considerably earlier than the proposed time of opening of an inland ice-free corridor (∼11,000 years ago). 
Computer simulations indicate that coastal migration models could be more consistent with patterns of mtDNA and classical marker data (Surovell, 2003; Fix, 2005). Recently, full mtDNA genome sequence data have also been interpreted as supporting a coastal population dispersal in the Americas (Fagundes et al., 2008).

Previous research has shown contrasting patterns of genetic variation in Western (mostly Andean) versus Eastern South America, characterized by lower within-population diversity (and greater differentiation) in the East relative to the West (Luiselli et al., 2000; Tarazona-Santos et al., 2001; Wang et al., 2007). Craniometric analyses also indicate an important differentiation between populations of East and West South America (Pucciarelli et al., 2006). The data examined here for uniparental and biparental markers support this view in showing the lowest within-population diversity and the highest differentiation in Eastern South America. By contrast, Andean populations show the highest diversity and lowest differentiation in the region, while populations from Northwest South America show intermediate values (Tables 2–3). These observations point to a differing demographic history across the Americas, with lower genetic drift in Meso America and in the Andes relative to the rest of South America. The availability of data for different genetic marker systems allows additional, independent assessments of the demographic history of these populations. For instance, Fu's Fs test has been previously applied to worldwide mtDNA D-loop sequence data and has been shown to reveal a signal of population expansion (as evidenced by negative Fs) in all human populations, except certain hunter gatherers and Native Americans (Excoffier & Schneider, 1999). The lack of a signal of expansion in these populations has been interpreted either as resulting from recent bottlenecks that have erased the signal of past population growth, or as indicating that effective sizes during initial spatial population expansions were smaller for populations with positive Fs (Excoffier & Schneider, 1999; Ray et al., 2003).
In our survey, we find that Fs values, calculated from mtDNA D-loop sequences, are negative in populations from Meso America and the Andes, while populations from East and Northwest South America are characterized by positive Fs values (Table 4). The mtDNA data are thus consistent with population expansion in Meso America and the Andes, but not in Northwest and East South America. Independent evidence of a contrasting demographic history in the Americas is provided by the tests applied to the X-chromosome STR data shown in Table 4. These are also suggestive of bottlenecks in East and Northwest South America and of population expansion in Meso America and the Andes. The data collected here also allow a (rough) evaluation of the variable impact that changes in population size could have on the relative diversity of different marker systems in the Americas. It has been shown that population contraction can lead to relatively lower ratios of X to autosomal diversity (Pool & Nielsen, 2007), as observed in populations from East and Northwest South America (Table 4; equilibrium values being dependent on the mutational properties of STRs and population effective sizes). Similarly, estimates of population differentiation obtained with various marker systems will be affected differently by the unequal impact of changes in population size on estimates of intrapopulation diversity. At a continent-wide level we observe increasing population structure with data from autosomes, the X-chromosome, mtDNA, and the Y-chromosome (FST of 0.068, 0.094, 0.256, and 0.390, respectively). However, there is considerable regional variation in FST values between marker systems (Table 3). It is interesting to note that FST estimates based on autosomal STRs for Meso America and the Andes are higher than those obtained with X-STRs for the same regions. The opposite is seen in Northwest and East South America.
As mentioned above, after a change in population size, diversity on the X-chromosome is expected to reach new equilibrium values more rapidly than that on the autosomes. Consequently, although after a population contraction the difference between F(X)ST and F(A)ST will be increased over equilibrium values, the opposite will occur if there is a population expansion. Consistent with this scenario is the observation that higher FST estimates are observed with the Y-chromosome than with mtDNA across all regions of the Americas (Table 3). A number of previous studies have examined population structure based on mtDNA and Y-chromosome data in order to evaluate differences in migration rates between men and women (Cavalli-Sforza & Bodmer, 1971; Mesa et al., 2000; Bortolini et al., 2002; Segurel et al., 2008), and it has been shown that such differences in migration rates can also reduce the expected difference between F(X)ST and F(A)ST (Segurel et al., 2008). Nevertheless, population genetic theory indicates that F(A)ST is not expected to exceed F(X)ST. For this to happen, additional demographic factors need to be invoked, such as a greater variance in reproductive success in men than in women (Cavalli-Sforza & Bodmer, 1971; Segurel et al., 2008). Overall, our results point to larger long-term effective population sizes in Andean populations relative to the rest of South America, possibly associated with events of population expansion in the Andes and contraction elsewhere. Further assessment of this scenario will require additional analyses, accounting for the specific mutational properties of the markers examined and the fine-grained population structure patterns in each region (Stadler et al., 2009).
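
The equilibrium expectation that F(A)ST should not exceed F(X)ST can be illustrated with the textbook island-model approximation FST ≈ 1/(1 + 4·Ne·m), scaling the effective size by 1, 3/4, and 1/4 for autosomes, the X-chromosome, and the uniparental markers. This is a sketch under strong assumptions (equal sex ratio, equal migration and equal variance in reproductive success for both sexes, negligible mutation); the parameter values are hypothetical and this is not an analysis performed in the study.

```python
# Relative effective sizes under an equal sex ratio (assumed for illustration).
RELATIVE_NE = {
    "autosomes": 1.0,
    "X-chromosome": 0.75,
    "Y-chromosome": 0.25,
    "mtDNA": 0.25,
}


def island_fst(n_e, m, relative_ne=1.0):
    """Island-model equilibrium approximation FST = 1 / (1 + 4 * Ne * m)."""
    if n_e <= 0 or m < 0:
        raise ValueError("n_e must be positive and m non-negative")
    return 1.0 / (1.0 + 4.0 * n_e * relative_ne * m)


# Hypothetical autosomal effective size and migration rate per generation.
expected = {marker: island_fst(1000, 0.005, scale)
            for marker, scale in RELATIVE_NE.items()}
for marker, value in expected.items():
    print(f"{marker}: {value:.3f}")
```

Under these assumptions the expected ordering is autosomes < X-chromosome < Y-chromosome = mtDNA, so regional departures from it, such as F(A)ST above F(X)ST or F(Y)ST well above F(mtDNA)ST, point to additional demographic factors of the kind discussed in the text.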

From a population history perspective, our results point to Andean populations separating from other South Americans early in the colonization of the subcontinent and subsequently maintaining larger population sizes relative to other South American populations (Rothhammer & Dillehay, 2009). This early split of Andeans and non-Andeans is also consistent with the geographic distribution of Amerind languages, in that the Andean stock extends furthest south, along the western side of the subcontinent, and as such is likely to represent the initial settlers of the region (Cavalli-Sforza et al., 1994). Archaeological evidence suggests that human population sizes were relatively large along the Pacific coast, even prior to the development of agriculture, sustained by the rich local fishing resources (Bellwood, 2001). Within South America, agriculture developed first along the central Andes, resulting in the establishment in this region of a dense population and culminating in the development of the Inca Empire (Bellwood, 2001). Populations in Western South America therefore seem to have had relatively large sizes for a considerable amount of time and also to have undergone a relatively recent expansion, possibly associated with the use of intensive agriculture. Archaeological information indicates that developments in Northwest, and particularly in East, South America were more incipient than in the Andes and were associated with relatively lower population densities.

In conclusion, the nuclear and mtDNA data analyzed here are consistent with an important role for the coasts in the initial settlement of the American continent. Although our analyses did not aim at estimating specific evolutionary parameters, our observations suggest a differentiated demographic history between regions in South America, perhaps associated with an early divergence of Andean from non-Andean populations. Examining a larger number of Native populations (particularly increasing geographic coverage) should allow a more detailed evaluation of the routes of dispersal during the settlement of the American continent. Beyond evaluating the role of the coasts in this dispersal more precisely, it will be interesting to examine the impact of other major geographic features of the continent, including the main mountain ranges and river valleys. The feasibility of genome-wide population surveys of increasing resolution will soon culminate in a full description of the genetic diversity of human population samples (i.e., individual genome sequences). The application of explicit methods for testing alternative evolutionary scenarios (and estimating their associated parameters) to such data should enable a refined assessment of the initial settlement and routes of population dispersal in the American continent.

Acknowledgements

We are very grateful to the volunteers who contributed samples to this project. We thank M. Villena and P. Herrera for assistance. NNY was supported by an Overseas Research Student award, a fellowship from the KC Wong Education Foundation and the Charlotte and Yule Bogue Research Fund (UCL). This work was partially funded by the Programme Interdisciplinaire CNRS Amazonie (Analyse, modélisation et ingénierie des systèmes amazoniens), the Brazilian Conselho Nacional de Desenvolvimento Científico e Tecnológico, the Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul, Colciencias grant 1115–40-520279 and CODI (Sostenibilidad 2009–2011) Universidad de Antioquia.

References

  • Anderson, S., Bankier, A. T., Barrell, B. G., De Bruijn, M. H., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., Schreier, P. H., Smith, A. J., Staden, R. & Young, I. G. (1981) Sequence and organization of the human mitochondrial genome. Nature 290, 457–465.
  • Bellwood, P. (2001) Early agriculturalist population diasporas? Farming, languages, and genes. Annu Rev Anthropol 30, 181–207.
  • Bonatto, S. L. & Salzano, F. M. (1997) Diversity and age of the four major mtDNA haplogroups, and their implications for the peopling of the New World. Am J Hum Genet 61, 1413–1423.
  • Bortolini, M. C., Salzano, F. M., Bau, C. H., Layrisse, Z., Petzl-Erler, M. L., Tsuneto, L. T., Hill, K., Hurtado, A. M., Castro-De-Guerra, D., Bedoya, G. & Ruiz-Linares, A. (2002) Y-chromosome biallelic polymorphisms and Native American population structure. Ann Hum Genet 66, 255–259.
  • Bortolini, M. C., Salzano, F. M., Thomas, M. G., Stuart, S., Nasanen, S. P., Bau, C. H., Hutz, M. H., Layrisse, Z., Petzl-Erler, M. L., Tsuneto, L. T., Hill, K., Hurtado, A. M., Castro-De-Guerra, D., Torres, M. M., Groot, H., Michalski, R., Nymadawa, P., Bedoya, G., Bradman, N., Labuda, D. & Ruiz-Linares, A. (2003) Y-chromosome evidence for differing ancient demographic histories in the Americas. Am J Hum Genet 73, 524–539.
  • Bourgeois, S., Yotova, V., Wang, S., Bourtoumieu, S., Moreau, C., Michalski, R., Moisan, J. P., Hill, K., Hurtado, A. M., Ruiz-Linares, A., & Labuda, D. (2009) X-chromosome lineages and the settlement of the Americas. Am J Phys Anthropol 140, 417–428.
  • Brandstatter, A., Peterson, C. T., Irwin, J. A., Mpoke, S., Koech, D. K., Parson, W., & Parsons, T. J. (2004) Mitochondrial DNA control region sequences from Nairobi (Kenya): inferring phylogenetic parameters for the establishment of a forensic database. Int J Legal Med 118, 294–306.
  • Cavalli-Sforza, L. L. & Bodmer, W. F. (1971) The genetics of human populations. San Francisco, CA: Freeman.
  • Cavalli-Sforza, L. L., Menozzi, P., & Piazza, A. (1994) The History and Geography of Human Genes. Princeton, NJ: Princeton University Press.
  • Chinnery, P. F., Howell, N., Andrews, R. M., & Turnbull, D. M. (1999) Mitochondrial DNA analysis: polymorphisms and pathogenicity. J Med Genet 36, 505–510.
  • Cornuet, J. M. & Luikart, G. (1996) Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics 144, 2001–2014.
  • Dixon, E. J. (2001) Human colonization of the Americas: timing, technology and process. Quat Sci Rev 20, 277–299.
  • Excoffier, L. & Schneider, S. (1999) Why hunter-gatherer populations do not show signs of Pleistocene demographic expansions. Proc Natl Acad Sci USA 96, 10597–10602.
  • Excoffier, L., Laval, G., & Schneider, S. (2005) Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol Bioinform Online 1, 47–50.
  • Fagundes, N. J., Kanitz, R., Eckert, R., Valls, A. C., Bogo, M. R., Salzano, F. M., Smith, D. G., Silva, W. A., Jr., Zago, M. A., Ribeiro-Dos-Santos, A. K., Santos, S. E., Petzl-Erler, M. L., & Bonatto, S. L. (2008) Mitochondrial population genomics supports a single pre-Clovis origin with a coastal route for the peopling of the Americas. Am J Hum Genet 82, 583–592.
  • Falush, D., Stephens, M., & Pritchard, J. K. (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587.
  • Fiedel, S. J. (2000) The peopling of the New World: present evidence, new theories, and future directions. J Archaeol Res 8, 39–103.
  • Fix, A. G. (2002) Colonization models and initial genetic diversity in the Americas. Hum Biol 74, 1–10.
  • Fix, A. G. (2005) Rapid deployment of the five founding Amerind mtDNA haplogroups via coastal and riverine colonization. Am J Phys Anthropol 128, 430–436.
  • Forster, P., Harding, R., Torroni, A., & Bandelt, H. J. (1996) Origin and evolution of Native American mtDNA variation: a reappraisal. Am J Hum Genet 59, 935–945.
  • Fu, Y. X. (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147, 915–925.
  • Handley, L. J., Manica, A., Goudet, J., & Balloux, F. (2007) Going the distance: human population genetics in a clinal world. Trends Genet 23, 432–439.
  • Jakobsson, M., Scholz, S. W., Scheet, P., Gibbs, J. R., Vanliere, J. M., Fung, H. C., Szpiech, Z. A., Degnan, J. H., Wang, K., Guerreiro, R., Bras, J. M., Schymick, J. C., Hernandez, D. G., Traynor, B. J., Simon-Sanchez, J., Matarin, M., Britton, A., Van De Leemput, J., Rafferty, I., Bucan, M., Cann, H. M., Hardy, J. A., Rosenberg, N. A., & Singleton, A. B. (2008) Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451, 998–1003.
  • Karafet, T. M., Zegura, S. L., Posukh, O., Osipova, L., Bergen, A., Long, J., Goldman, D., Klitz, W., Harihara, S., De Knijff, P., Wiebe, V., Griffiths, R. C., Templeton, A. R., & Hammer, M. F. (1999) Ancestral Asian source(s) of New World Y-chromosome founder haplotypes. Am J Hum Genet 64, 817–831.
  • Keinan, A., Mullikin, J. C., Patterson, N., & Reich, D. (2009) Accelerated genetic drift on chromosome X during the human dispersal out of Africa. Nat Genet 41, 66–70.
  • Lell, J. T., Sukernik, R. I., Starikovskaya, Y. B., Su, B., Jin, L., Schurr, T. G., Underhill, P. A., & Wallace, D. C. (2002) The dual origin and Siberian affinities of Native American Y chromosomes. Am J Hum Genet 70, 192–206.
  • Li, J. Z., Absher, D. M., Tang, H., Southwick, A. M., Casto, A. M., Ramachandran, S., Cann, H. M., Barsh, G. S., Feldman, M., Cavalli-Sforza, L. L., & Myers, R. M. (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104.
  • Liu, H., Prugnolle, F., Manica, A., & Balloux, F. (2006) A geographically explicit genetic model of worldwide human-settlement history. Am J Hum Genet 79, 230–237.
  • Liu, K. & Muse, S. V. (2005) PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21, 2128–2129.
  • Luiselli, D., Simoni, L., Tarazona-Santos, E., Pastor, S., & Pettener, D. (2000) Genetic structure of Quechua-speakers of the Central Andes and geographic patterns of gene frequencies in South Amerindian populations. Am J Phys Anthropol 113, 5–17.
  • Maruyama, T. & Fuerst, P. A. (1984) Population bottlenecks and nonequilibrium models in population genetics. I. Allele numbers when populations evolve from zero variability. Genetics 108, 745–763.
  • Merriwether, D. A., Rothhammer, F., & Ferrell, R. E. (1995) Distribution of the four founding lineage haplotypes in Native Americans suggests a single wave of migration for the New World. Am J Phys Anthropol 98, 411–430.
  • Mesa, N. R., Mondragon, M. C., Soto, I. D., Parra, M. V., Duque, C., Ortiz-Barrientos, D., Garcia, L. F., Velez, I. D., Bravo, M. L., Munera, J. G., Bedoya, G., Bortolini, M. C., & Ruiz-Linares, A. (2000) Autosomal, mtDNA, and Y-chromosome diversity in Amerinds: pre- and post-Columbian patterns of gene flow in South America. Am J Hum Genet 67, 1277–1286.
  • Nei, M. & Chesser, R. K. (1983) Estimation of fixation indices and gene diversities. Ann Hum Genet 47, 253–259.
  • Nei, M., Maruyama, T., & Chakraborty, R. (1975) The bottleneck effect and genetic variability in populations. Evolution 29, 1–10.
  • Pool, J. E. & Nielsen, R. (2007) Population size changes reshape genomic patterns of diversity. Evolution 61, 3001–3006.
  • Pritchard, J. K., Stephens, M., & Donnelly, P. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959.
  • Pucciarelli, H. M., Neves, W. A., Gonzalez-Jose, R., Sardi, M. L., Rozzi, F. R., Struck, A., & Bonilla, M. Y. (2006) East-West cranial differentiation in pre-Columbian human populations of South America. Homo 57, 133–150.
  • Ramachandran, S., Deshpande, O., Roseman, C. C., Rosenberg, N. A., Feldman, M. W., & Cavalli-Sforza, L. L. (2005) Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc Natl Acad Sci USA 102, 15942–15947.
  • Ray, N. (2005) PATHMATRIX: a geographical information system tool to compute effective distances among samples. Mol Ecol Notes 5, 177–180.
  • Ray, N., Currat, M., & Excoffier, L. (2003) Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol 20, 76–86.
  • Rosenberg, N. A. (2004) DISTRUCT: a program for the graphical display of population structure. Mol Ecol Notes 4, 137–138.
  • Rothhammer, F. & Dillehay, T. D. (2009) The late Pleistocene colonization of South America: an interdisciplinary perspective. Ann Hum Genet 73, 540–549.
  • Ruhlen, M. (1991) A guide to the world's languages. Stanford, CA: Stanford University Press.
  • Ruiz-Linares, A., Ortiz-Barrientos, D., Figueroa, M., Mesa, N., Munera, J. G., Bedoya, G., Velez, I. D., Garcia, L. F., Perez-Lezaun, A., Bertranpetit, J., Feldman, M. W., & Goldstein, D. B. (1999) Microsatellites provide evidence for Y chromosome diversity among the founders of the New World. Proc Natl Acad Sci USA 96, 6312–6317.
  • Salzano, F. M. & Callegari-Jacques, S. M. (1988) South American Indians: a case study in evolution. Oxford: Clarendon Press.
  • Segurel, L., Martinez-Cruz, B., Quintana-Murci, L., Balaresque, P., Georges, M., Hegay, T., Aldashev, A., Nasyrova, F., Jobling, M. A., Heyer, E., & Vitalis, R. (2008) Sex-specific genetic structure and social organization in Central Asia: insights from a multi-locus study. PLoS Genet 4, e1000200.
  • Seielstad, M., Yuldasheva, N., Singh, N., Underhill, P., Oefner, P., Shen, P., & Wells, R. S. (2003) A novel Y-chromosome variant puts an upper limit on the timing of first entry into the Americas. Am J Hum Genet 73, 700–705.
  • Siegel, S. (1956) Nonparametric statistics for the behavioural sciences. Tokyo: McGraw-Hill.
  • Stadler, T., Haubold, B., Merino, C., Stephan, W., & Pfaffelhuber, P. (2009) The impact of sampling schemes on the site frequency spectrum in nonequilibrium subdivided populations. Genetics 182, 205–216.
  • Surovell, T. A. (2003) Simulating coastal migration in New World colonization. Curr Anthropol 44, 580–591.
  • Tarazona-Santos, E., Carvalho-Silva, D. R., Pettener, D., Luiselli, D., De Stefano, G. F., Labarga, C. M., Rickards, O., Tyler-Smith, C., Pena, S. D., & Santos, F. R. (2001) Genetic differentiation in South Amerindians is related to environmental and cultural diversity: evidence from the Y chromosome. Am J Hum Genet 68, 1485–1496.
  • Torroni, A., Schurr, T. G., Cabell, M. F., Brown, M. D., Neel, J. V., Larsen, M., Smith, D. G., Vullo, C. M., & Wallace, D. C. (1993) Asian affinities and continental radiation of the four founding Native American mtDNAs. Am J Hum Genet 53, 563–590.
  • Wang, S., Bedoya, G., Labuda, D., & Ruiz-Linares, A. (2010) Brief communication: patterns of linkage disequilibrium and haplotype diversity at Xq13 in six Native American populations. Am J Phys Anthropol 142, 476–480.
  • Wang, S., Lewis, C. M., Jakobsson, M., Ramachandran, S., Ray, N., Bedoya, G., Rojas, W., Parra, M. V., Molina, J. A., Gallo, C., Mazzotti, G., Poletti, G., Hill, K., Hurtado, A. M., Labuda, D., Klitz, W., Barrantes, R., Bortolini, M. C., Salzano, F. M., Petzl-Erler, M. L., Tsuneto, L. T., Llop, E., Rothhammer, F., Excoffier, L., Feldman, M. W., Rosenberg, N. A., & Ruiz-Linares, A. (2007) Genetic Variation and Population Structure in Native Americans. PLoS Genet 3, e185.
  • Wilkins, J. F. (2006) Unraveling male and female histories from human genetic data. Curr Opin Genet Dev 16, 611–617.

Supporting Information

Table S1 Sequence of primers used for amplification of Y-STRs

Figure S1 First three components obtained by PCA of a pairwise genetic distance matrix obtained from allele frequency data for 38 X-chromosome STRs for the full set of 22 Native American populations examined here. Population symbols and colouring are as in Figure 1 of the main paper.

Figure S2 Histograms showing the proportion of alleles within particular frequency bins across 38 X-STRs for the 22 Native American populations examined here.

Filename                          Format   Size   Description
AHG_608_sm_figS1.pdf                       290K   Supporting info item
AHG_608_sm_figS2.pdf                       318K   Supporting info item
AHG_608_sm_figS2_continued.pdf             312K   Supporting info item
AHG_608_sm_SuppMat.doc                     39K    Supporting info item