Diachronic Investigations of Mitochondrial and Y-Chromosomal Genetic Markers in Pre-Columbian Andean Highlanders from South Peru


  • Lars Fehren-Schmitz,

    Corresponding author
    1. Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach Department of Zoology and Anthropology, University Goettingen, Goettingen 37073, Germany
  • Ole Warnberg,

    1. Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach Department of Zoology and Anthropology, University Goettingen, Goettingen 37073, Germany
  • Markus Reindel,

    1. German Archaeological Institute (DAI), Commission for Archaeology of Non-European Cultures (KAAK), Bonn 53173, Germany
  • Verena Seidenberg,

    1. Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach Department of Zoology and Anthropology, University Goettingen, Goettingen 37073, Germany
  • Elsa Tomasto-Cagigao,

    1. Departamento de Humanidades, Pontificia Universidad Católica del Perú, Lima 32, Perú
  • Johny Isla-Cuadrado,

    1. Instituto Andino de Estudios Arqueológicos (INDEA), Lima 11, Perú
  • Susanne Hummel,

    1. Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach Department of Zoology and Anthropology, University Goettingen, Goettingen 37073, Germany
  • Bernd Herrmann

    1. Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach Department of Zoology and Anthropology, University Goettingen, Goettingen 37073, Germany

  • Grant sponsorship: German Federal Ministry of Education and Research (BMBF); Grant number: 01UA0804B.

Corresponding author: Dr. Lars Fehren-Schmitz, Historic Anthropology and Human Ecology, Johann-Friedrich-Blumenbach
Department of Zoology and Anthropology, University of Goettingen, Buergerstrasse 50, 37073 Goettingen. Tel: +49 551/39-22249;
Fax: +49 551/39-3645; E-mail: lfehren@gwdg.de


This study examines the reciprocal effects of cultural evolution and population dynamics in pre-Columbian southern Peru through the analysis of DNA from populations that lived in the fringe area between the Andean highlands and the Pacific coast. The main objective is to reveal whether the transition from the Middle Horizon (MH: 650–1000 AD) to the Late Intermediate Period (LIP: 1000–1400 AD) was accompanied or influenced by population dynamic processes. Tooth samples from 90 individuals from several archaeological sites in the research area, dating to the MH and LIP, were collected to analyse mitochondrial and Y-chromosomal genetic markers. Coding region polymorphisms were successfully analysed and replicated for 72 individuals, as were control region sequences for 65 individuals and Y-chromosomal single nucleotide polymorphisms (SNPs) for 19 individuals, and these were compared to a large set of ancient and modern indigenous South American populations. The diachronic comparison of the upper valley samples from both time periods reveals no genetic discontinuities accompanying the cultural dynamic processes. A high genetic affinity to other ancient and modern highland populations can be observed, suggesting genetic continuity in the Andean highlands from the MH at the latest. A significant matrilineal differentiation from ancient Peruvian coastal populations can be observed, suggesting a differential population history.


The pre-Columbian cultural landscape of the western Central Andean region is characterised by two adjacent main cultural areas with extremely different ecological conditions: the Pacific Coast including the western slopes of the Andes, and the Andean Highlands. In a time span of approximately 13,000 years (Dixon, 2001; Dillehay, 2009) the areas experienced the transition from small nomadic groups to sedentism, and the evolution of complex societies and even states. These developments are the result of complex patterns of interaction between the areas, with constantly changing reciprocal interferences and changes in the intensity and direction of cultural transmission. The question of the extent to which these cultural processes were accompanied or influenced by population dynamic processes, alterations of mobility patterns, and even gene flow remains mostly unanswered.

The understanding of human genetic diversity in South America and the processes involved in the peopling of the continent has greatly improved in the last years through several population genetic studies analysing mitochondrial, Y-chromosomal, and autosomal markers in contemporary indigenous populations (e.g., Fuselli et al., 2003; Wang et al., 2007; Lewis & Long, 2008; Lewis, 2009b). However, to reveal prehistoric population affinities and gene flow between pre-Columbian populations and the basis of modern diversity, it is necessary to consider the diachronic, microevolutionary perspective through the employment of palaeogenetic methods. The analysis of DNA from prehistoric human remains proved to be a powerful tool to reveal such processes when used appropriately (Paabo et al., 2004; Bramanti et al., 2009). For South America the number of palaeogenetic studies is still small. Only recently there has been an increased interest in studies of pre-Columbian Central Andean populations from the coast (Shimada et al., 2004; Moraga et al., 2005; Fehren-Schmitz et al., 2009, 2010) and the highlands (Shinoda et al., 2006; Lewis et al., 2007a; Kemp et al., 2009; Carnese et al., 2010).

Current archaeological data indicate that the initial development of social, technological, and demographic complexity in both main Central Andean cultural areas did not occur simultaneously (Dillehay et al., 2004). The level of mutual exchange increases with the Initial Period (1800–900 BC) and becomes apparent with the emergence of the Chavin style in the Early Horizon (800–200 BC) (Burger, 1981; Kembel & Rick, 2004). Especially for the central and southern areas there is no evidence that this spread was accompanied by significant population biological processes (Burger, 2008). This assumption is supported by palaeogenetic studies that verify a significant genetic differentiation between the prehistoric coastal and highland populations in Southern Peru, at least for a period ranging from 800 BC to 650 AD, even though cultural influences of coastal cultures are evident in the adjacent highlands (Shimada et al., 2004; Shinoda et al., 2006; Fehren-Schmitz, 2008; Fehren-Schmitz et al., 2009, 2010). With the transition to the Middle Horizon (600–1000 AD) the first highland state societies, like the Wari, emerged and intentionally spread their cultural traditions throughout the central Andean area through partially aggressive expansion, state-influenced migration, and relocation (Schreiber, 1992; Tung, 2007; Heggarty, 2008). These processes supposedly influenced migratory behaviour, and the demography and genetic composition of the Central Andean populations, through urbanisation and alterations of social complexity (D'Altroy & Schreiber, 2004; Tung, 2008). With the Late Intermediate Period (1000–1400 AD) more locally restricted cultural groups with varying social complexity arose, altering the relation and spatial dominance between coast and highlands and probably the directions and intensity of human mobility and thereby gene flow.
However, recent palaeogenetic investigations on highland populations suggest genetic continuity in the Central Andean highlands at the latest from the MH on (Lewis et al., 2007a; Kemp et al., 2009). Considering the distinctness of the earlier pre-Columbian coastal and highland populations it can be assumed that if there were any significant population biological changes resulting from the cultural interactions of the main cultural domains they would have left their imprint in the direct fringe area of both major cultural zones.

In the course of a joint interdisciplinary project of Peruvian and German archaeologists and natural scientists we had the opportunity to collect samples for a DNA analysis from six archaeological sites in South Peru situated in the upper valleys of the Rio Palpa and Rio Viscas directly adjacent to the Altiplano (Fig. 1), thus representing a fringe area as described above. Additionally, the investigation area is in the proximity of the coastal sites analysed in a previous archaeogenetic study (Fehren-Schmitz et al., 2010). Here we analyse mitochondrial and Y-chromosomal genetic markers from chronologically succeeding skeletal collectives to characterise the pre-Columbian human genetic diversity and to reveal if there are diachronic changes in the genetic population composition. The comparison of the diversity observed in the upper valley populations with modern and pre-Columbian coastal and highland populations will permit us to determine to what extent cultural interactions between the coastal and the highland areas were accompanied or influenced by human population dynamic processes. Since types of human migration and mobility can be sex specific processes (Sanjek, 2003) matrilineal and patrilineal genetic markers are observed.

Figure 1.

Map of the investigation area in South Peru. The crosses mark the sampled archaeological sites (cf. Table 1).

Material and Methods

Sites and Samples

The area of investigation is formed by the upper valleys of the Rio Palpa and Rio Viscas directly adjacent to the Andean Puna at an altitude of approximately 3000–4000 m (Quechua and Suni ecological zones; Pulgar-Vidal, 1979), surrounding the modern town of Laramate, Lucanas Province, Region Ayacucho (Fig. 1). The area is situated about 50 km to the east of the archaeological sites in the northern Rio Grande de Nasca drainage (Chala ecological zone) investigated in the previous project (Reindel & Wagner, 2009; Fehren-Schmitz et al., 2010). In two field campaigns (2008, 2009) bone and tooth samples from 74 human individuals were collected by the main author (LF). The sampled individuals derive from settlement-associated burial caves and Chullpas, typical stone-built burial structures in the highlands, dating to the Middle Horizon and the Late Intermediate Period (Table 1). Additionally, 16 individuals from the site of Pacapaccari had already been sampled in the course of a previous project (Fehren-Schmitz et al., 2010). The largest skeletal series was collected in the Cueva Yacotogia. The burial cave contained human remains from about 60 individuals with a regular sex and age distribution. The archaeological site could be dated to the MH through associated findings belonging to the Wari culture and direct Accelerator Mass Spectrometry (AMS) 14C dating of bones, revealing an age of 1187 ± 26 BP, σ1 cal. AD 782–883 (calibrated with INTCAL04, Lab-Number Hd-28475, Environmental Physics – University Heidelberg). All information regarding the other sites is found in Table 1. The assigned dates of the sampled individuals are based primarily on the archaeological context (closed association) in the graves and secondarily on the stratigraphic information and general archaeological association of the cemeteries.
Although there are absolute numerical dates for most sites (Table 1), only relative chronological dates—cultural periods—are applied here because most of the absolute dates have not been obtained from the bones themselves.

Table 1.  Sampled archaeological sites, dates and coordinates

Site            n¹    Burial type   Time period²   14C Age (years BP)³   Altitude
Yacotogia       31    Cave          MH             1187 ± 26             3420 m
Ocoro           10    Chullpa       MH             n.d.                  3350 m
Pacapaccari     16    Chullpa       LIP            820 ± 24              3012 m
Botigiriayocc   15    Cave          LIP            n.d.                  3535 m
Huayuncalla     8     Chullpa       LIP            978 ± 26              3091 m
Layuni          10    Abri          LIP-LH         n.d.                  3557 m

  ¹ n = number of sampled individuals.
  ² MH = Middle Horizon, LIP = Late Intermediate Period, LH = Late Horizon.
  ³ Only listed when dated from bone material.

Since most individuals were found disarticulated in the graves we only collected teeth from mandibles or a Pars petrosa to ensure that no individual is sampled twice. In general, two independent samples per individual were collected. They were instantly sealed to prevent contamination and then exported from Peru to Germany with the permission of the Institute of Peruvian Cultural Heritage (Instituto Nacional de Cultura del Perú).

Contamination Prevention

Previously, no modern DNA-based studies had been performed in our laboratories, which are dedicated to ancient DNA analysis. All analyses were carried out according to commonly applied precautions for the analysis of ancient DNA (e.g., Hummel, 2003), including a strict separation of pre- and post-PCR laboratories and the use of disposable protective clothing, glasses, and disposable laboratory gloves to prevent contamination by the staff handling the material. Experiments were carried out with disposable laboratory ware such as pipette tips and cups. Workbenches and all other laboratory equipment were treated with detergent (Alconox Detergent, Aldrich, Germany), bidistilled water, and ethanol to remove contaminating DNA. All disposable ware and most solutions, buffers, and MgCl2 were irradiated with ultraviolet light employing aluminium foil coating (Tamariz et al., 2006). Negative extraction controls and negative PCR controls were employed in this study.

DNA samples were taken from every person who had contact with the sample material. A database was established containing the genetic data of all excavation helpers, archaeologists, project members in Peru, and all people working in our laboratories in Germany, as well as information on which people had contact with which samples. This procedure allows a gapless documentation of the people who might possibly have contaminated the samples. Samples for which contamination by one of these people could not be ruled out were excluded from further analyses.
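The screening logic described above can be sketched as a simple lookup. This is a minimal illustration, not the database used in the study: the names `STAFF_PROFILES`, `CONTACT_LOG`, and `unresolved_handlers`, as well as all profiles, are hypothetical. It also assumes that a handler without a typed profile counts as "cannot be ruled out".

```python
# Hypothetical reference profiles: (mtDNA haplogroup, HVR1 differences from rCRS).
STAFF_PROFILES = {
    "excavator_1": ("H", frozenset({"16311C"})),
    "excavator_2": ("T", frozenset({"16126C", "16294T"})),
    "lab_member_1": ("U", frozenset({"16270T"})),
}

# Hypothetical contact log: which people handled which sample.
CONTACT_LOG = {
    "sample_01": ["excavator_1", "lab_member_1"],
    "sample_02": ["excavator_2"],
    "sample_03": ["visitor_1"],  # handler who was never typed
}


def unresolved_handlers(sample_profile, handlers, staff_profiles):
    """List handlers who cannot be ruled out as the source of the sample's profile."""
    suspects = []
    for name in handlers:
        profile = staff_profiles.get(name)
        # Assumption: an untyped handler cannot be excluded; a typed handler
        # is a suspect only if the profile matches the sample exactly.
        if profile is None or profile == sample_profile:
            suspects.append(name)
    return suspects


sample = ("B", frozenset({"16189C", "16217C"}))
print(unresolved_handlers(sample, CONTACT_LOG["sample_01"], STAFF_PROFILES))
```

A sample would be kept only if the returned list is empty for its result.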

Sample Preparation and DNA Extraction

The preparation of the bone and tooth samples for DNA extraction followed a standardised protocol (Hummel, 2003). After the outer surfaces of the samples had been amply removed with an electric drill fitted with a diamond-tipped saw blade (K10, KaVo, Bieberbach, Germany), the fragments were mechanically pulverised. Decalcification of the samples and DNA extraction followed a previously described protocol (Fehren-Schmitz et al., 2010) employing automated DNA purification in a BioRobot EZ1 (Qiagen, Hilden, Germany). The DNA extract was then stored at −20°C.

For each individual we made two or more independent DNA extracts. In most cases, the different extracts from one individual originated from different skeletal samples, to allow an authentication of analysis results by comparison.

Mitochondrial HVR1 Analysis

To determine the mitochondrial haplotypes of the individuals we analysed a 388 bp fragment of the mitochondrial hypervariable region 1 (HVR1; nucleotide positions 16,021–16,408 relative to the revised Cambridge reference sequence (rCRS); Andrews et al., 1999). For this purpose we used a set of four primer pairs generating overlapping PCR products with a length range of 157–180 bp. Primer sequences for the amplification and the description of PCR conditions are found in Fehren-Schmitz et al. (2010). PCR success and PCR product quantity were checked by gel electrophoresis on 2.5% agarose gels. Following this, the products were purified with a MinElute PCR Purification Kit (Qiagen) and subjected to direct sequencing. Sequence reactions were prepared using the HPLC-grade PCR primers and the BigDye Terminator v1.1 Kit (Applied Biosystems, Carlsbad, CA, USA). Taq cycle sequencing conditions followed the instructions of the manufacturer. Subsequently, sequencing products were purified with the NucleoSeq Kit (Macherey-Nagel, Dueren, Germany) and analysed using an ABI Prism 310 Genetic Analyzer and supplied software (Applied Biosystems).

Each of the four overlapping segments of the HVR1 was amplified at least two times per extract. When the results of these replications were highly inconsistent the samples were discarded from further analysis. Punctual sequence aberrations from the consensus were classified as potential postmortem sequence degradations (Gilbert et al., 2003a, 2003b). In this case we performed additional amplifications of the conspicuous DNA segment and additionally sequenced the light-strand (Paabo et al., 2004). If inconsistencies could not be excluded and no consensus was found, the samples were also discarded from the dataset.
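The replication rule above can be illustrated with a column-wise consensus over aligned replicate reads. This is a sketch under stated assumptions, not the authors' procedure: the function name `replicate_consensus` and the `min_support` threshold are hypothetical, and the reads are assumed to be already aligned to the same length.

```python
from collections import Counter


def replicate_consensus(reads, min_support=2):
    """Build a consensus from aligned replicate reads of one HVR1 segment.

    Returns (consensus, disagreements). A position is called only if its most
    frequent base is seen at least `min_support` times and strictly more often
    than any other base; otherwise it is written as 'N'. `disagreements` lists
    every position where the replicates differ (candidate damage sites).
    """
    if len({len(r) for r in reads}) != 1:
        raise ValueError("replicate reads must be aligned to the same length")
    consensus, disagreements = [], []
    for pos, column in enumerate(zip(*reads)):
        counts = Counter(column).most_common()
        top_base, top_n = counts[0]
        runner_up = counts[1][1] if len(counts) > 1 else 0
        if len(counts) > 1:
            disagreements.append(pos)
        resolved = top_n >= min_support and top_n > runner_up
        consensus.append(top_base if resolved else "N")
    return "".join(consensus), disagreements


print(replicate_consensus(["ACCTGA", "ACCTGA", "ACTTGA"]))
```

A position that stays 'N' after additional amplifications would correspond to the "no consensus" case in which the sample is discarded.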

Mitochondrial Haplogroup Analysis

The HVR1 sequences (haplotypes) can be tentatively assigned to their respective haplogroups according to specific mutations, but there is a high risk of false determination through fast-evolving mutational hotspots and postmortem sequence aberrations (Meyer et al., 1999; Stoneking, 2000; Gilbert et al., 2003b; Bandelt, 2005). Therefore, we additionally analysed the specific coding region polymorphisms for the four Native American haplogroups A, B, C, and D (Torroni et al., 2006). To determine the haplogroups, we analysed three single nucleotide polymorphisms (SNPs) characterising groups A (np 663 A to G), C (np 13263 A to G), and D (np 5178 C to A), and a 9 bp deletion between np 8272 and 8280 in the noncoding cytochrome oxidase II/tRNALys intergenic region characterising B (Schurr et al., 1990; Torroni et al., 1993; Merriwether et al., 1995). Additionally, we analysed a SNP characterising macrogroup R* (np 12705 C to T), which includes the most frequent European mt-haplogroups (van Oven & Kayser, 2009), to discriminate against contamination from the German archaeologists and lab members.
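The assignment rule implied by these markers can be written as a small lookup. The sketch below uses only the diagnostic states listed in the text for A, C, D, and B; the function name and the decision to return `None` for zero or conflicting markers are assumptions made for illustration. The R* control marker is deliberately left out.

```python
# Diagnostic derived alleles as listed in the text (positions relative to rCRS).
DIAGNOSTIC_SNPS = {
    "A": ("663", "G"),
    "C": ("13263", "G"),
    "D": ("5178", "A"),
}


def assign_mt_haplogroup(snp_calls, has_9bp_deletion):
    """Assign A, B, C, or D from coding region results.

    snp_calls maps a position (as a string) to the observed base.
    Returns None when no marker or more than one marker is derived,
    which would call for re-typing rather than an assignment.
    """
    hits = [
        haplogroup
        for haplogroup, (position, derived) in DIAGNOSTIC_SNPS.items()
        if snp_calls.get(position) == derived
    ]
    if has_9bp_deletion:
        hits.append("B")
    return hits[0] if len(hits) == 1 else None


ancestral = {"663": "A", "13263": "A", "5178": "C"}
print(assign_mt_haplogroup(ancestral, has_9bp_deletion=True))
```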

For the analysis of the five coding region polymorphisms, we developed a multiplex Single Base Extension (SBE) assay allowing a simultaneous amplification of the target sequences. All primers for the amplification (Table 2) and the minisequencing (Table 4) reactions were designed using the Primer Select software (Lasergene 8.0 package, DNAstar). The minisequencing/SBE primers were designed one base contiguous to the polymorphic site of interest in either the forward or reverse orientation. Additionally, variable-length poly-A tails were added to the 5′ end of the primers in order to ensure an effective separation of the products during electrophoresis (Nelson et al., 2007).
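The idea behind the poly-A tails, spacing the SBE products by length so that they separate during electrophoresis, can be sketched as follows. The sequences are placeholders and not the primers of this study, and both the function name `add_poly_a_tails` and the 5 nt spacing are assumptions chosen only for illustration.

```python
def add_poly_a_tails(primers, spacing=5):
    """Pad primers at the 5' end with A's so tailed lengths differ by >= spacing nt."""
    tailed = {}
    previous_length = None
    # Process from shortest to longest so each primer lands above the previous one.
    for name, seq in sorted(primers.items(), key=lambda item: len(item[1])):
        if previous_length is None:
            target = len(seq)
        else:
            target = max(len(seq), previous_length + spacing)
        tailed[name] = "A" * (target - len(seq)) + seq
        previous_length = target
    return tailed


# Placeholder sequences for illustration only.
example = {
    "sbe_A": "CTGAGTCCATGACTTGCA",
    "sbe_C": "GATTCCAGTCAGGTACCTA",
    "sbe_D": "TTGACCGATACGGTCATGCAAGTC",
}

for name, seq in add_poly_a_tails(example).items():
    print(name, len(seq), seq)
```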

Table 2.  Primers used for the multiplex amplification of the haplogroup specific SNPs and 9 bp deletion

System    Primer    Sequence (5′–3′)    Conc. (μM)¹

  ¹ Conc. = primer concentration used in the primer mix for the multiplex amplification.

Table 3.  Primers used for the multiplex amplification of the Y-Chromosomal SNPs

HG¹    Mutation²    Primer    Sequence (5′–3′)    Conc. (μM)³

  ¹ HG = Y-chromosomal haplogroup.
  ² Haplogroup determining mutation following the Y-Chromosome Consortium nomenclature.
  ³ Conc. = primer concentration used in the primer mix for the multiplex amplification.


The multiplex amplification of the five amplicons was carried out in a total volume of 50 μl containing 2.5 units AmpliTaqGold DNA hot start Polymerase, 2 mM MgCl2 (both: Applied Biosystems), 200 μM dNTP-Mix (Roche, Mannheim, Germany), 5 μl multiplex primer set (concentrations in Table 2), and 1 μl BSA (20 g/ml, Roche) in 10 × Buffer II (Applied Biosystems). As template we added 5–10 μl DNA extract and completed the final reaction volume with DNA and DNase free HPLC H2O (Merck). PCR amplification took place in a Mastercycler (Eppendorf, Hamburg, Germany) under the following conditions: initialisation at 94°C for 11 min; 35–40 cycles at 94°C for 1 min, 56°C for 1 min, and 72°C for 1.5 min; final elongation at 60°C for 30 min. PCR success and PCR product quantity were checked by gel electrophoresis. We used extra thin, highly concentrated 3.75% agarose gels poured on a slope, allowing separation down to 3–4 bp in the 50–120 bp range (Hummel, 2003) and direct identification of the 9 bp deletion between np 8272 and 8280 characterising haplogroup B. Following this, the products were purified with the MinElute PCR Purification Kit (Qiagen). To determine the polymorphisms for haplogroups A, C, D, and R, single base extension was carried out using 2.5 μl SNaPshot Ready Reaction Mix (Applied Biosystems), 1 μl purified PCR product, 0.37 μl pooled SBE primer mix (final concentrations in Table 4), and 1.13 μl HPLC H2O (Merck, Darmstadt, Germany) in a 10 μl final volume. The thermocycling conditions were as follows: 25 cycles at 96°C for 10 sec, 50°C for 5 sec, and 60°C for 30 sec. Afterwards the SBE products were purified through incubation with 1 μl alkaline phosphatase (rAPid, Roche) at 37°C for 1 h. Analysis followed using an ABI Prism 310 Genetic Analyzer (Applied Biosystems). The samples were prepared for capillary electrophoresis following the instructions of the manufacturer.

Y-Chromosomal Haplogroup Analysis

To reveal the patrilineal population dynamics, we analysed six SNPs from the nonrecombining portion of the Y chromosome (NRY-DNA) characterising the NRY-haplogroups C* (M130), Q* (M242), Q1a3a* (M3), Q1a3a1 (M199), Q1a3a2 (M19), Q1a3a3 (M194) (Underhill et al., 1997; Bergen et al., 1999; Ruiz-Linares et al., 1999; Underhill et al., 2001; Seielstad et al., 2003; Karafet et al., 2008) following the nomenclature of the Y Chromosome Consortium (2002). The haplogroup Q1a3a* (formerly Q-M3) is the most frequent in Native South American males with a frequency of 77% (Bortolini et al., 2003). The haplogroups Q1a3a1, −2 and −3 downstream to Q1a3a* are specific to South America (Karafet et al., 2008). Q1a3a2 has been found mostly in indigenous populations of the northern Amazonian regions (Bortolini et al., 2003) while there is no detailed information for the distributions of both other Q1a3a groups. Haplogroup Q* upstream to Q1a3a* is the second most frequent group while C* was found only in very few indigenous South American individuals on the northern coast (Bortolini et al., 2003; Bailliet et al., 2009).
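Because the six markers are nested, a haplogroup call follows from the most derived marker that is consistent with the rest. The sketch below encodes the hierarchy exactly as labelled in the text (Q* upstream of Q1a3a*, which is upstream of Q1a3a1 to Q1a3a3, with C* separate). The function names and the tolerance for untyped upstream markers (allelic dropout) are assumptions for illustration.

```python
# Marker -> (haplogroup label, parent marker), using the labels given in the text.
Y_TREE = {
    "M130": ("C*", None),
    "M242": ("Q*", None),
    "M3": ("Q1a3a*", "M242"),
    "M199": ("Q1a3a1", "M3"),
    "M19": ("Q1a3a2", "M3"),
    "M194": ("Q1a3a3", "M3"),
}


def lineage(marker):
    """Return the marker together with all of its upstream markers."""
    path = []
    while marker is not None:
        path.append(marker)
        marker = Y_TREE[marker][1]
    return path


def assign_y_haplogroup(derived_markers):
    """Return the most derived haplogroup, or None if the calls are empty or conflict.

    Upstream markers may be missing (dropout), but every derived marker
    must lie on a single lineage.
    """
    derived = set(derived_markers)
    if not derived or not derived.issubset(Y_TREE):
        return None
    tip = max(derived, key=lambda m: len(lineage(m)))
    if not derived.issubset(lineage(tip)):
        return None
    return Y_TREE[tip][0]


print(assign_y_haplogroup({"M242", "M3"}))
```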

For the analysis of the six polymorphisms, we developed a multiplex SBE assay following the same principles as described for the mitochondrial coding region polymorphisms. Primers for the amplification are found in Table 3 and minisequencing primers in Table 4. The multiplex amplification of the six amplicons was carried out in a total volume of 50 μl containing 2.5 units AmpliTaqGold DNA hot start Polymerase, 2 mM MgCl2 (both: Applied Biosystems), 200 μM dNTP-Mix (Roche), 3.2 μl multiplex primer set (concentrations in Table 3), and 1 μl BSA (20 g/ml, Roche) in 10 × Buffer II (Applied Biosystems). As template we added 10 μl DNA extract and completed the final reaction volume with DNA and DNase free HPLC H2O (Merck). Afterwards, PCR success was checked by gel electrophoresis and products were purified as described before. The minisequencing reaction and all subsequent steps were identical to those described for the mt-SBE assay.

Table 4.  Single Base Extension Primers used in the multiplex minisequencing reactions for the determination of Mitochondrial and Y-Chromosomal haplogroups

HG    Polymorphisms    Primer    Orientation    Sequence (5′–3′)    Conc. (μM)

Data Analysis

All sequence data obtained were manually verified with the SeqMan Software (Lasergene Software 8, DNAStar, Madison, WI, USA) to avoid analysis and reading errors of the sequencer software (Salas et al., 2005). Alignments of the HVR1 sequences were performed using the MegAlign software (DNAStar) employing the Clustal V algorithm. The analysed individuals were assigned to mitochondrial haplogroups primarily using the coding region data, and the assignments were then checked for consistency with the HVR1 sequence data.

Several statistical parameters were computed at the haplogroup- and the haplotype (sequence) level. Comparative haplogroup frequency and sequence data were obtained from contemporary populations (Ward et al., 1996; Moraga et al., 2000; Williams et al., 2002; Fuselli et al., 2003; Lewis et al., 2005, 2007b; Cabana et al., 2006; Melton et al., 2007) and pre-Columbian populations (Shimada et al., 2004; Shinoda et al., 2006; Lewis et al., 2007a; Kemp et al., 2009; Carnese et al., 2010; Fehren-Schmitz et al., 2010). To compare mtDNA haplogroup frequencies, we employed a correspondence analysis using Statistica 8 (StatSoft, Hamburg, Germany). For the haplotype/sequence data, diversity indices within populations and geographical groups (cf. Table 5) were calculated (Tajima, 1983; Nei, 1987) and biological distances between the populations and groups were estimated from mtDNA sequence data. Pairwise FST values, derived Slatkin distances for populations with short divergence times (Slatkin, 1995), Nei's genetic distance D (Nei & Li, 1979), and also the diversity indices were calculated using Arlequin 3.11 (Excoffier et al., 2005). Genetic distances between haplotypes and mean distances between groups were calculated employing the Tamura and Nei distance model with gamma correction (Tamura & Nei, 1993) with the suggested gamma value of 0.26 for mitochondrial HVR1 data (Meyer et al., 1999). To visualise genetic relationships, we performed Multidimensional Scaling (MDS) based on the distance matrices using Statistica 8 (StatSoft). Exact tests of population differentiation (Goudet et al., 1996), and an analysis of molecular variance (AMOVA) simulating different groupings, were also performed with Arlequin. All described tests were performed with at least 10,000 permutations.
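For orientation, the two within-population indices can be computed as sketched below. This is not Arlequin's implementation: it uses the standard estimator of haplotype diversity, Hd = n/(n − 1) · (1 − Σ p_i²), and an uncorrected nucleotide diversity (mean proportion of pairwise differences), whereas the study applied the Tamura and Nei model. Function names and the toy sequences are assumptions.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return differences / (len(pairs) * length)


# Five individuals carrying four haplotypes can only be distributed as 2, 1, 1, 1.
print(round(haplotype_diversity(["h1", "h1", "h2", "h3", "h4"]), 4))
print(round(nucleotide_diversity(["AAAA", "AAAT", "AATT"]), 4))
```

The first call reproduces the value of 0.9000 listed in Table 5 for a sample of five individuals with four haplotypes (Huayuncalla).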

Table 5.  Distribution of mtDNA haplogroups and HVR1-based diversity parameters of the analysed prehistoric populations and the contemporary reference populations

                                                                         Haplogroup Frequencies
Group                    Population                  n¹         H²      A      B      C      D      Hd³      π⁴       Reference
Ancient Upper Valleys    Yacotogia (MH)⁵             25         12      0.04   0.56   0.40   0.00   0.9100   0.0151   this study
                         Ocoro (MH)                  2 (5)      2       0.20   0.60   0.20   0.00   1.0000   0.0282   this study
                         Total MH                    27 (30)    13      0.07   0.57   0.37   0.00   0.9145   0.0158
                         Pacapaccari (LIP)           12 (16)    6       0.00   0.69   0.31   0.00   0.8636   0.0174   (Fehren-Schmitz et al., 2010)
                         Botigiriayocc (LIP)         12         9       0.08   0.42   0.25   0.25   0.9394   0.0178   this study
                         Huayuncalla (LIP)           5          4       0.20   0.40   0.40   0.00   0.9000   0.0215   this study
                         Layuni (LIP)                9          8       0.00   0.56   0.33   0.11   0.9722   0.0191   this study
                         Total LIP                   38 (42)    25      0.05   0.55   0.31   0.10   0.9730   0.0184
Ancient Highlands        Conchapata (MH)             10         9       0.29   0.50   0.14   0.07   0.9778   0.0227   (Kemp et al., 2009)
                         Chen Chen (MH)              (23)       n.d.    0.39   0.39   0.17   0.04   n.d.     n.d.     (Lewis et al., 2007b)
                         Huari (LIP)                 17         12      0.17   0.22   0.55   0.06   0.9412   0.0183   (Kemp et al., 2009)
                         Ancient Highlands (LH)      (35)       n.d.    0.09   0.66   0.23   0.03   n.d.     n.d.     (Shinoda et al., 2006)
                         Ancient NW-Argentina        19         9       0.11   0.47   0.00   0.42   0.9123   0.0177   (Carnese et al., 2010)
Ancient Peruvian Coast   Ancient North Coast (MH)    (36)       n.d.    […] et al., 2004)
                         Ancient South Coast (EH)    31 (38)    […] et al., 2010)
                         Ancient South Coast (EIP)   61 (65)    43      0.02   0.14   0.30   0.54   0.9852   0.0140   (Fehren-Schmitz et al., 2010)
Northern Colombia        Chibchan                    80         7       0.81   0.[…] et al., 2007)
                         Arawaken                    29         4       0.28   0.28   0.44   0.00   0.7734   0.0172   (Melton et al., 2007)
Northern Peru            Ancash                      33         27      0.09   0.52   0.18   0.21   0.9811   0.0168   (Lewis et al., 2005)
                         San Martin                  21         14      0.10   0.57   0.05   0.29   0.9333   0.0211   (Fuselli et al., 2003)
                         Tupe                        16         9       0.00   0.69   0.31   0.00   0.8667   0.0158   (Lewis et al., 2007b)
                         Yungay                      36         20      0.03   0.47   0.36   0.14   0.9540   0.0158   (Lewis et al., 2007b)
Southern Peru            Arequipa                    22         18      0.09   0.68   0.14   0.09   0.9784   0.0164   (Fuselli et al., 2003)
                         Puno (Quechua)              30         22      0.07   0.60   0.23   0.10   0.9724   0.0152   (Lewis et al., 2007b)
                         Puno (Aymara)               14         11      0.00   0.71   0.14   0.14   0.9670   0.0146   (Lewis et al., 2007b)
                         Tayacaja                    59         40      0.22   0.34   0.31   0.13   0.9655   0.0195   (Fuselli et al., 2003)
South Middle Chile       Mapuche                     34 (111)   […] et al., 2000)
                         Pehuenche                   24 (105)   […] et al., 2000)
Tierra del Fuego         Yaghan                      15 (21)    […] et al., 2000)
Amazonia                 Gaviao                      27         7       0.[…] et al., 1996)
                         Xavante                     25         4       0.16   0.84   0.00   0.00   0.6767   0.0082   (Ward et al., 1996)
                         Yanomamö                    155        6       0.00   0.56   0.32   0.12   0.6566   0.0122   (Williams et al., 2002)
                         Zoro                        30         9       0.[…] et al., 1996)
Bolivian Lowlands        Bolivian Lowlands (Pool)    53         32      0.18   0.24   0.50   0.06   0.9710   0.0167   (Bert et al., 2004)
Gran Chaco               Gran Chaco (Pool)           204        46      0.16   0.42   0.13   0.28   0.9371   0.0197   (Cabana et al., 2006)

  ¹ n = number of typed individuals. Numbers in brackets are for individuals with only mt-haplogroup information.
  ² H = number of different haplotypes in population.
  ³ Hd = haplotype diversity.
  ⁴ π = nucleotide diversity.


Mitochondrial DNA

From a total of 90 individuals (incl. Pacapaccari) it was possible to reproducibly determine the mt-haplogroups of 72 individuals (65%) through the analysis of coding region polymorphisms and to obtain the mt-haplotypes (complete 388 bp HVR1 sequence) of 65 individuals (59%). The successfully analysed samples are distributed over all archaeological sites that have been sampled. The preservation of mtDNA in the samples from Yacotogia, Pacapaccari, and Layuni proved to be very good. In general, if results could not be reproduced, the individuals were discarded from the data set. Throughout the study, negative extraction and negative PCR controls consistently showed negative results.

Haplogroup distribution for the total upper valley sample was as follows: 5.6% A, 55.6% B, 33.3% C, and 5.6% D. The detailed mt-haplogroup frequencies for the ancient populations analysed and the consulted reference populations, grouped into major chronological and spatial divisions, are found in Table 5. In all six populations analysed haplogroup B is predominant, followed by haplogroup C. There are overall low frequencies of haplogroup A except in Ocoro and Huayuncalla (20%), but the frequencies here are biased by the very low sample size (n = 5). Haplogroup D is missing in the MH sample but present in moderate to low frequencies in the LIP individuals from Botigiriayocc and Layuni (25%, 11%). The plot of the correspondence analysis among the analysed and the ancient and contemporary reference populations (Table 5) is found in Figure 2. Two clusters are formed, one consisting of the analysed highland sites and other ancient and contemporary Central Andean populations but also the Gran Chaco and Amazonia populations. The other cluster is formed by the three ancient coastal populations from Peru and the contemporary populations of southern and southernmost Chile. The populations from Northern Colombia form a distant outgroup.
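The pooled percentages can be reproduced from haplogroup counts. The counts below are not reported in the text; they are back-calculated here from the stated percentages and the 72 typed individuals, and should be read as an assumption. The function name is likewise hypothetical.

```python
def haplogroup_frequencies(counts):
    """Convert haplogroup counts into relative frequencies."""
    total = sum(counts.values())
    return {haplogroup: count / total for haplogroup, count in counts.items()}


# Assumed counts: inferred from 5.6% / 55.6% / 33.3% / 5.6% of 72 individuals.
upper_valley_counts = {"A": 4, "B": 40, "C": 24, "D": 4}

for haplogroup, freq in haplogroup_frequencies(upper_valley_counts).items():
    print(f"{haplogroup}: {100 * freq:.1f}%")
```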

Figure 2.

Correspondence analysis plot based on mitochondrial haplogroup frequencies.

The HVR1 sequence data from the 65 successfully typed individuals could be assigned to 33 mitochondrial haplotypes (H) segregated through 39 polymorphic sites (S). All haplotypes could be assigned to one of the four Native American founding haplogroups A, B, C, and D independently confirmed through the coding region polymorphisms. A summary of the determined mt-haplotypes spanning 388 bp of the HVR1 (np 16,021–16,408) and their distribution over the different sites and chronological periods is found in Table 6. The Native American founding haplotypes (Tamm et al., 2007) are the most frequent for every haplogroup (A2 = AT-A1; B2 = AT-B1/2; C1 = AT-C1; D1 = AT-D1). There are 19 singleton haplotypes and eight typed for one site in more than one individual. Six haplotypes—four of them distinct haplotypes different from the founder haplotypes—are shared between different archaeological sites, and five (two) of them are also shared between the two time periods MH and LIP. The comparison with the whole dataset (cf. Table 5 for citations) reveals exact haplotype matches between AT-B13 and one individual from the Gran Chaco, AT-C5 and three individuals from Amazonia, AT-C7 and individuals from the Bolivian lowlands, Gran Chaco, and South Middle Chile, and between AT-C11 and one individual from Amazonia. Only the distinct haplotype AT-B12 is shared with the Early Intermediate Period (EIP) coastal population at the Andean foothills downstream of the river valleys. All obtained haplotypes were also cross-checked with the available datasets of Native American and Asian populations and the Genographic Project mitochondrial database (Behar et al., 2007). The haplotype AT-D2 is missing the haplogroup characteristic C-T transition at np16223 typical for haplogroup D. The haplogroup status was independently confirmed by coding region.

Table 6.  Mitochondrial HVR1 haplotypes and their chronological distribution observed in the ancient Peruvian populations analysed in this study

The genetic diversity measures calculated from the HVR1 sequence data are found in Table 5. All analysed pre-Columbian samples from the Central Andean region show a high level of genetic diversity (Hd = MH 0.9145; LIP 0.9730) congruent with those from the analysed contemporary Andean populations (Hd = 0.8667 – 0.9811).
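The haplotype diversity (Hd) values reported above can be illustrated with a minimal sketch. It assumes the standard unbiased estimator Hd = n/(n − 1) · (1 − Σ p_i²); the text shown here does not state which software or estimator variant was used, and the haplotype labels in the example are hypothetical placeholders, not data from Table 6.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).

    `haplotypes` is a list with one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample of eight individuals (placeholder labels, not study data).
sample = ["h1", "h1", "h1", "h2", "h2", "h3", "h4", "h5"]
hd = haplotype_diversity(sample)
print(round(hd, 4))
```

Values close to 1, as reported for the MH and LIP samples, mean that two randomly drawn individuals almost always carry different haplotypes.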

Interpopulation genetic distances (sequence data) were calculated on two levels: comparing the single analysed pre-Columbian populations with other ancient and contemporary Central Andean populations (Table 7), and comparing ancient and contemporary populations grouped by chronological and major geographical origin on the continental scale (Table 8). Different distance calculations were performed and showed concordant results (not all data shown). All sampled archaeological sites exhibit low genetic distances to each other and to the ancient and modern Central Andean populations. An exact test of differentiation between each pair of populations revealed that most distances between the analysed upper valley sites and the other populations are not significant (significance level: 0.05), except those to the ancient populations from the south coast, e.g., Yacotogia to Ancient South Coast (EIP) (P = 0.00098 ± 0.0010 SD). However, it has to be considered that the nonsignificance, especially for sites like Ocoro, Huayuncalla, and Layuni, may be due to small sample size (cf. Table 5). An MDS plot employing the pairwise FST matrix of the group comparison (Table 8) is shown in Figure 3. The plot has a stress value of 0.082, well below the upper bound of 0.199 (Sturrock & Rocha, 2000), indicating that it is an accurate representation of the original distance matrix. As in the single site comparison, there is a very low and nonsignificant genetic distance between the pooled MH and LIP populations (pairwise FST = 0.0018, P = 0.32617 ± 0.0140 SD) and low distances between the ancient and the contemporary Central Andean populations, except for the ancient coastal populations (Table 8). The ancient coastal populations exhibit their lowest distances to the contemporary populations of Southern Chile and Tierra del Fuego, but only the latter clusters together with them in the MDS plot (Fig. 3).

Table 7.  Pairwise FST values from HVR1 sequence data between each single pair of ancient and modern Peruvian populations
* FST P-value indicates that the genetic distance in the comparison is not statistically significant.

Population                    1        2        3        4        5        6        7        8        9        10       11       12       13       14       15       16       17
1 Layuni (LIP)
2 Botigiriayocc (LIP)         0.0000*
3 Pacapaccari (LIP)           0.0062*  0.0399*
4 Huari (LIP)                 0.0593*  0.0069*  0.1399
5 Yacotogia (MH)              0.0000*  0.0054*  0.0059   0.0747*
6 Conchapata (MH)             0.0000*  0.0000*  0.0000*  0.0550*  0.0000*
7 Ocoro (MH)                  0.0000*  0.0000*  0.0000*  0.0747*  0.0468*  0.0000*
8 Huayuncalla (LIP)           0.0005*  0.0000*  0.0000*  0.0000*  0.0247*  0.0000*  0.0000*
9 Arequipa                    0.0737*  0.0702*  0.1289   0.1802   0.1045   0.0617*  0.0374*  0.1346*
10 San Martin                 0.1491   0.0712*  0.2507   0.1541   0.2346   0.1098   0.0000*  0.1059*  0.1212
11 Tayacaja                   0.0973   0.0196*  0.1914   0.0684   0.1475   0.0718   0.0000*  0.0524*  0.0736   0.0177*
12 Ancash                     0.0409*  0.0084*  0.1243   0.0743   0.0806   0.0338*  0.0000*  0.0589*  0.0079*  0.0664   0.0137
13 Puno (Quechua)             0.0553*  0.0356*  0.1024   0.0884   0.0614   0.0525*  0.0622*  0.0710*  0.0160*  0.1312   0.0659   0.0057*
14 Yungay                     0.0431*  0.0129*  0.1467   0.0575   0.0801   0.0733   0.1142*  0.0851*  0.0516   0.1119   0.0402   0.0057*  0.0308*
15 Tupe                       0.0693*  0.0808*  0.0885*  0.1346   0.0723*  0.0857*  0.1498*  0.1193*  0.0286*  0.1909   0.1205   0.0391*  0.0133*  0.0453*
16 Puno (Aymara)              0.0496*  0.0434*  0.1345   0.1214   0.0935   0.0593*  0.0684*  0.0892*  0.0135   0.1325   0.0695   0.0000*  0.0000*  0.0457*  0.0410*
17 Ancient South Coast (EH)   0.4628   0.2333   0.7079   0.2125   0.5271   0.4502   0.5372   0.4148   0.5373   0.2425   0.1456   0.2833   0.4330   0.2684   0.6196   0.4831
18 Ancient South Coast (EIP)  0.1996   0.0757   0.3762   0.0679   0.2623   0.2271   0.2572   0.1752   0.3017   0.1603   0.0811   0.1465   0.2315   0.1216   0.3091   0.2451   0.0237

Table 8.  Pairwise FST values from HVR1 sequence data between each pair of chronologically and geographically grouped populations
* FST P-value indicates that the genetic distance in the comparison is not statistically significant.

Population                          1        2        3        4        5        6        7        8        9        10       11       12
1 North Peru
2 South Peru                        0.0000*
3 Ancient Peruvian Highlands (LIP)  0.0455   0.0514
4 Ancient Peruvian Highlands (MH)   0.0646   0.0690   0.0018*
5 Bolivian Lowlands                 0.0775   0.0889   0.0327   0.0936
6 Gran Chaco                        0.0632   0.0548   0.0429   0.0490   0.0966
7 South Middle Chile                0.0549   0.0628   0.0700   0.1247   0.0524   0.0861
8 Tierra del Fuego                  0.2060   0.2347   0.1995   0.3157   0.1264   0.2067   0.0802
9 Amazonia                          0.0277   0.0298   0.0196   0.0295   0.0836   0.0435   0.0756   0.2507
10 Northern Colombia                0.3234   0.2980   0.3800   0.4726   0.2836   0.2889   0.3406   0.4980   0.4142
11 Ancient South Coast (EH)         0.2499   0.2636   0.2834   0.4714   0.1829   0.2102   0.1769   0.1205   0.3002   0.4052
12 Ancient South Coast (EIP)        0.1381   0.1565   0.1378   0.2558   0.0668   0.1380   0.0866   0.0626   0.1602   0.3805   0.0228*
13 Ancient NW-Argentina             0.1014   0.1025   0.1057   0.1150   0.1953   0.0721   0.1702   0.3143   0.0951   0.4484   0.3763   0.2410

Figure 3.

MDS plot based on pairwise FST values derived from mitochondrial HVR1 sequences. Raw stress was 0.082. NC = Northern Colombia; AH–MH = Ancient Highlands–Middle Horizon; AH–LIP = Ancient Highlands–Late Intermediate Period; NP = North Peru; SP = South Peru; GC = Gran Chaco; AA–EIP = Ancient Argentina; LB = Bolivian lowlands; SC = southern Chile; TdF = Tierra del Fuego; ASC–EIP = Ancient South Coast–Early Intermediate Period; ASC–EH = Ancient South Coast–Early Horizon.
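As a reading aid for Table 8, the following sketch expands the lower-triangle FST values into a full symmetric matrix and ranks the modern groups by their distance to an ancient coastal group. The values are transcribed from Table 8 with the significance asterisks dropped; the helper names (`to_symmetric`, `nearest_modern`) are hypothetical, and the sketch is not the procedure used to produce the MDS plot.

```python
# Group labels and lower-triangle pairwise FST values transcribed from Table 8.
labels = [
    "North Peru", "South Peru", "Ancient Peruvian Highlands (LIP)",
    "Ancient Peruvian Highlands (MH)", "Bolivian Lowlands", "Gran Chaco",
    "South Middle Chile", "Tierra del Fuego", "Amazonia", "Northern Colombia",
    "Ancient South Coast (EH)", "Ancient South Coast (EIP)",
    "Ancient NW-Argentina",
]
lower = [
    [],
    [0.0000],
    [0.0455, 0.0514],
    [0.0646, 0.0690, 0.0018],
    [0.0775, 0.0889, 0.0327, 0.0936],
    [0.0632, 0.0548, 0.0429, 0.0490, 0.0966],
    [0.0549, 0.0628, 0.0700, 0.1247, 0.0524, 0.0861],
    [0.2060, 0.2347, 0.1995, 0.3157, 0.1264, 0.2067, 0.0802],
    [0.0277, 0.0298, 0.0196, 0.0295, 0.0836, 0.0435, 0.0756, 0.2507],
    [0.3234, 0.2980, 0.3800, 0.4726, 0.2836, 0.2889, 0.3406, 0.4980, 0.4142],
    [0.2499, 0.2636, 0.2834, 0.4714, 0.1829, 0.2102, 0.1769, 0.1205, 0.3002,
     0.4052],
    [0.1381, 0.1565, 0.1378, 0.2558, 0.0668, 0.1380, 0.0866, 0.0626, 0.1602,
     0.3805, 0.0228],
    [0.1014, 0.1025, 0.1057, 0.1150, 0.1953, 0.0721, 0.1702, 0.3143, 0.0951,
     0.4484, 0.3763, 0.2410],
]


def to_symmetric(rows):
    """Expand a lower-triangle list of rows into a full symmetric matrix."""
    n = len(rows)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows):
        if len(row) != i:
            raise ValueError(f"row {i} should hold {i} values")
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full


def nearest_modern(full, names, target, k=3):
    """Return the k non-ancient groups with the lowest FST to `target`."""
    i = names.index(target)
    ranked = sorted(
        (full[i][j], names[j])
        for j in range(len(names))
        if j != i and not names[j].startswith("Ancient")
    )
    return ranked[:k]


fst = to_symmetric(lower)
closest_eh = nearest_modern(fst, labels, "Ancient South Coast (EH)")
closest_eip = nearest_modern(fst, labels, "Ancient South Coast (EIP)")
print(closest_eh)
print(closest_eip)
```

A symmetric matrix of this form is also the input that a multidimensional scaling routine would expect.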

The AMOVA calculations show lower variation between the ancient highland groups and contemporary Peruvian groups than between the ancient coastal groups and either the ancient highlands or contemporary Peruvians (Table 9). When the contemporary Peruvian datasets and the ancient highland datasets are each compared in aggregate to the aggregated Amazonian datasets, the percentage of variation between groups is near zero and ΦCT is nonsignificant. The percentage of variation between the ancient highlands and the aggregated populations of southern and southernmost Chile is only slightly higher than that between the ancient coastal populations and the Chileans. However, the associated ΦCT estimate for the ancient coast and Chile comparison is nonsignificant (P = 0.11327 ± 0.00280) after 10,000 permutations.

Table 9.  AMOVA percentage of variation and fixation indexes
* Fixation indexes found to be not significant (P values > 0.05) after 10,000 permutations

                                     Molecular variation (%)                                                   Fixation index
Comparison                           Between groups  Among populations within groups  Within populations      ΦCT      ΦSC      ΦST
Ancient Highlands to Ancient Coast   17.61           2.68                             79.71                   0.176    0.032*   0.203
Ancient Highlands to Peru            5.08            3.67                             91.25                   0.051    0.037    0.087
Ancient Coast to Peru                13.40           4.04                             82.56                   0.134    0.047    0.174
Ancient Highlands to Southern Chile  9.23            2.45                             88.32                   0.092    0.027*   0.117
Peru to Southern Chile               6.13            4.27                             89.60                   0.061    0.046    0.104
Ancient Coast to Southern Chile      6.83            4.01                             89.16                   0.068*   0.043    0.108
Ancient Highlands to Amazonia        −3.87           19.47                            84.40                   −0.039*  0.187    0.156
Peru to Amazonia                     −1.22           14.38                            86.83                   −0.012*  0.142    0.132
Ancient Coast to Amazonia            8.66            19.67                            71.68                   0.087*   0.215    0.283
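
The fixation indexes in Table 9 follow from the three variance percentages in the same row. The sketch below assumes the standard AMOVA definitions (ΦCT = between-group share of total variance, ΦSC = among-population share of the within-group variance, ΦST = combined share of both upper levels); the function name is hypothetical, and small deviations from the tabulated values are expected because the percentages are rounded.

```python
def phi_statistics(pct_between_groups, pct_among_pops, pct_within_pops):
    """Derive AMOVA fixation indexes from variance components given in percent."""
    total = pct_between_groups + pct_among_pops + pct_within_pops
    phi_ct = pct_between_groups / total
    phi_sc = pct_among_pops / (pct_among_pops + pct_within_pops)
    phi_st = (pct_between_groups + pct_among_pops) / total
    return phi_ct, phi_sc, phi_st


# Row "Ancient Highlands to Amazonia" from Table 9. The between-group
# component is slightly negative, which is usually read as zero differentiation.
phi_ct, phi_sc, phi_st = phi_statistics(-3.87, 19.47, 84.40)
print(round(phi_ct, 3), round(phi_sc, 3), round(phi_st, 3))
```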

Y-Chromosomal DNA

It was possible to reproducibly determine Y-chromosomal haplogroups for 19 individuals (Pacapaccari = 5; Botigiriayocc = 5; Layuni = 3; Yacotogia = 6). All individuals belong to haplogroup Q1a3a*. Only individuals for whom the full profile of six SNPs could be determined are considered here. A high number of individuals show allelic dropout for one or more SNPs, presumably due to DNA degradation. For two individuals showing the derived alleles at M242 and M3, only M19 could not be typed, so there is a chance that these individuals belong to haplogroup Q1a3a2 and not Q1a3a*.
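
The effect of allelic dropout on haplogroup assignment can be sketched as a small decision rule. This is a simplified illustration only: it uses just the three markers named above (the full panel comprised six SNPs, the others are not named here), the subclade labels are taken from the preceding paragraph, and the function name and call encoding are assumptions.

```python
DERIVED = "derived"
ANCESTRAL = "ancestral"


def assign_q_lineage(calls):
    """Assign a lineage label from SNP calls; a missing call (None) is dropout.

    `calls` maps marker names ("M242", "M3", "M19") to DERIVED, ANCESTRAL or None.
    """
    m242 = calls.get("M242")
    m3 = calls.get("M3")
    m19 = calls.get("M19")
    if m242 is None or m3 is None:
        return "undetermined"          # dropout at a marker needed for any call
    if m242 == ANCESTRAL:
        return "not Q"
    if m3 == ANCESTRAL:
        return "Q (M3 ancestral)"
    if m19 == DERIVED:
        return "Q1a3a2"                # label as used in the text above
    if m19 == ANCESTRAL:
        return "Q1a3a*"
    return "Q1a3a* or Q1a3a2"          # M19 dropout leaves the subclade open


full_profile = {"M242": DERIVED, "M3": DERIVED, "M19": ANCESTRAL}
m19_dropout = {"M242": DERIVED, "M3": DERIVED, "M19": None}
print(assign_q_lineage(full_profile))
print(assign_q_lineage(m19_dropout))
```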


DNA preservation for most studied sites proved to be good. This is consistent with other aDNA studies on Andean highland populations and differs from data obtained from soil burials at coastal sites. The relatively high success rate of DNA extraction was likely related to the burial environment. Most samples derive from caves and stone-lined Chullpas with no soil contact and stable microclimatic conditions (humidity and low temperatures). Such cave-like burial conditions have proved to be best for DNA preservation (Burger et al., 1999). The samples from Ocoro derive from a collapsed Chullpa. They were covered by soil and were not exposed to the microclimatic conditions described above, which explains the lower success rate.

The MH and LIP populations from the upper valley cannot be statistically differentiated at the mitochondrial haplogroup or haplotype level. At the haplogroup level, the only diachronic difference is the lack of haplogroup D in the MH. However, since it also occurs only at a very low frequency in the LIP (10%), this difference could be a sampling artefact resulting from the low sample size. At the haplotype level, distances are very low and nonsignificant, and MH and LIP sites share unique haplotypes. Thus, it cannot be excluded that the MH populations are directly matrilineally ancestral to the LIP populations. Since all individuals successfully typed for Y-chromosomal haplogroups belong to Q1a3a*, no patrilineal discontinuity can be observed either. When the analysed populations are compared at the haplogroup level to other populations of similar antiquity in the Andes from the Ayacucho Basin (Kemp et al., 2009), Chen Chen in South Peru (Lewis et al., 2007a), and the Cusco area (Shinoda et al., 2006), and to modern Peruvian populations (cf. Table 5), homogeneity of the Central Andean populations at least from the MH onward can be suggested. This conclusion is supported by the interpopulation comparison of HVR1 sequence data with ancient and modern Peruvian populations, which mostly shows low FST values (cf. Tables 7 & 8). For the low genetic distances between ancient and modern populations, genetic drift cannot be rejected as the cause (Lewis, 2009a). Since there are no earlier genetic data from the highlands, preceding population dynamic events or population replacements cannot be ruled out.
The mitochondrial data can be interpreted as evidence for spatial matrilineal continuity throughout the Andes, including the upper valley of the western slopes, from the MH to modern times as suggested before (Lewis et al., 2007a; Kemp et al., 2009; Lewis, 2009a), although it is possible that the observed homogeneity throughout the Andes makes it impossible to detect discontinuities without much higher sample sizes and higher resolution genetic data.

The upper valley populations dating to the MH and LIP differ significantly from the EH and EIP coastal populations (Fehren-Schmitz et al., 2010) at the mitochondrial haplogroup and haplotype level (cf. Tables 5 & 7), even though they are separated by a geographic distance of only 50 km. AMOVA calculations show a variation between the groups of 17.61%. Even though there is archaeological evidence for strong cultural links between both areas (Reindel & Gruen, 2006; Reindel, 2009a, 2009b), population biological exchange, and accordingly gene flow, seems to have been marginal. It has to be taken into account, however, that this assumption results from an asynchronous comparison. If we assume that coastal and highland populations are distinct according to the major cultural zones, the pre-Columbian populations of the upper valleys belong to the highland sphere of population biological influence. No diachronic changes can be seen in the mitochondrial data that accompany the cultural changeover in the upper valley research area from Wari highland influence in the MH to coastal influence in the LIP. However, to date there are no genetic data for the LIP and only insufficient data for the MH from the coast, so there is no opportunity for direct synchronous comparisons. Since the comparisons of the MH and LIP highland populations to modern Peruvian populations show only low genetic distances (Table 7), and there is no detectable genetic distinction between modern Peruvian coastal and highland populations, there must have been population dynamic processes that led to the homogenisation of the Central Andean population, starting at the earliest in the LIP. A possible factor could be the expansions of the far-reaching highland empires, primarily that of the Incas. However, to validate such hypotheses, more genetic data from coastal populations dating to the LIP and LH would be needed.

A possible explanatory model for the phenomenon that the upper valley populations show no signs of intermixture with the proximate coastal populations, even though there were migrations from the coast to the highlands in the late EIP, can be based on physical stressors affecting unadapted humans at altitudes above 2500 m (Fehren-Schmitz et al., 2009). There is evidence from historical sources and high altitude medicine that unadapted women coming to the highlands have a significantly higher chance of stillbirth than women adapted to high altitude habitats (Moran, 2000; Moore et al., 2004; Gonzales et al., 2008). Even if the coastal women adapt to the physical stressors in the highlands after a period of time, it could be suggested that they have a quantitative reproductive disadvantage compared to the women living in the highlands since birth (Gonzales & Tapia, 2009). Thus, the demographic maternal influence on the populations above 2500 m would have been too low to have a significant impact on the mitochondrial haplogroup frequencies. However, to validate this hypothesis, exact numerical data on the possible reproductive disadvantage and comparable demographic data for the pre-Columbian coastal and upper valley populations would be needed to run statistical simulations of the possible scenarios. Neither exact information regarding the reproductive chances of unadapted women in high altitude habitats nor exact demographic information is accessible at the moment.
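
The kind of simulation envisaged above can be sketched with a deliberately simple deterministic model. Every number in it is a hypothetical placeholder, since the text states that neither the size of the reproductive disadvantage nor the demographic parameters are known; the model also ignores drift, ongoing migration, and adaptation of later generations.

```python
def immigrant_lineage_fraction(initial_fraction, relative_success, generations):
    """Track the share of immigrant maternal lineages over discrete generations.

    `relative_success` is the reproductive output of immigrant-lineage women
    relative to resident women (1.0 means no disadvantage).
    """
    f = initial_fraction
    trajectory = [f]
    for _ in range(generations):
        weighted = f * relative_success
        f = weighted / (weighted + (1.0 - f))
        trajectory.append(f)
    return trajectory


# Hypothetical scenario: 20% immigrant lineages, 20% reproductive disadvantage.
traj = immigrant_lineage_fraction(0.20, 0.8, 10)
print([round(x, 3) for x in traj])
```

Under these assumed values the immigrant share shrinks every generation, which is the qualitative behaviour the hypothesis relies on; whether the effect would be strong enough in reality cannot be judged without the missing data.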

When compared on the continental scale at the haplogroup level, the ancient coastal populations show a high affinity (high frequencies of haplogroup D and low frequencies of haplogroup B) to modern indigenous populations of southern Chile and Tierra del Fuego (Moraga et al., 2000). This could be interpreted in terms of the hypothesis that the South American West Coast and the Andean highlands have a differential colonization history (Rothhammer et al., 2001). Rothhammer and colleagues postulated that the Andean Highlands might have been colonized from the eastern Amazonian regions, while the coastal populations derive from another route directly following the coastline. When AMOVA and FST calculations based on sequence data are considered (cf. Tables 8 & 9), the ancient and modern Andean populations exhibit a higher affinity to the modern Amazonian populations (FST = 0.0196–0.0298) than to the ancient coastal populations (FST = 0.2499–0.4714). However, the variation between modern Peru and the Chilean populations and that between the ancient coast and the Chilean populations are nearly identical, as are the pairwise FST values between the ancient coast and south-middle Chile and between modern Peru and south-middle Chile (Tables 8 & 9). Thus, at the haplotype level the ancient coastal populations show little or no higher affinity to the Chilean populations than is seen in the modern Peru-Chile comparison. The higher affinity in the Central Andean-Amazonia comparison is likely to be biased by the highly heterogeneous character of the employed Amazonian population pool. The single populations integrated in the pool are geographically and culturally distinct and show significant genetic differentiation when compared to each other (Ward et al., 1996; Williams et al., 2002). On the basis of the existing data, a distinct colonization of the Andean Highlands and the Pacific Coast cannot be validated as an explanatory model for the genetic differentiation of our coastal and upper valley populations.
However, the route along the coast is supported by the distribution of the mt-haplotype D4h3, found only in western South America, in Ecuador, coastal north Peru, and the indigenous populations of southern Chile (Perego et al., 2009), similar to the haplogroup patterns described before. That the differentiation between coast and highlands is not only a local phenomenon is also supported by ancient DNA data from an MH population of the Peruvian north coast (Shimada et al., 2004), which shows similar haplogroup frequencies and clusters together with the south coast populations in the correspondence analysis (Fig. 2). Unfortunately, only mitochondrial haplogroup frequencies are known for this site. On the other hand, genome-wide STR comparisons between western and eastern South America are inconsistent with an initial west-coast-only migration into the continent (Lewis, 2009b), which is consistent with a possible colonisation of the Andean region from the east. The encountered genetic variability in the continent can best be explained by a single founding population (Fuselli et al., 2003; Lewis et al., 2007b; Wang et al., 2007; Lewis & Long, 2008). The previously formulated possibility that western and eastern populations result from two founding populations entering South America separately and at different times (Tarazona-Santos et al., 2001) can be rejected on the basis of recent data. Populations from eastern and western South America show similar degrees of genetic variability, and the overall variability is too low to be explained by several migrational waves (Lewis & Long, 2008; Lewis, 2009b). However, problems for the evaluation of population dynamic processes can also result from the simplified geographical and cultural assignment of populations to groups, as in the simplified east-west distinction. For the east, Rothhammer et al. (2001) already doubted that the populations form a cohesive group.
The results of this study suggest that the west has not always been as genetically homogeneous as it appears in modern Central Andean populations. Therefore, it has to be taken into account that results from studies comparing modern indigenous populations from South America to reveal the initial processes of peopling are biased by prehistoric population dynamic processes resulting in significant palimpsest effects. Thus, the higher affinity of pre-Columbian west coast populations and their distinction from ancient highland and modern Central Andean populations must be rooted in earlier population dynamic processes, e.g., the initial peopling, and the modern overall homogeneity observed in Central Andean populations (coast and highlands) is most likely a product of later cultural and political processes like the expansions of the great highland empires (Shinoda et al., 2006; Fehren-Schmitz et al., 2009). Correlating the genetic information and the archaeological data, a scenario in which a single population entered the continent from the Isthmus of Panama and then split up along two or more main routes is conceivable. One group might have followed a route along the west coast, and another might have moved along the north coast and then into the Amazonian Basin, or maybe along the eastern slopes of the Andes, subsequently peopling the Andean highlands. This model, similar to the one suggested by Rothhammer et al. (2001), seems plausible but cannot be statistically tested on the basis of recent data.


Based on the genetic comparisons, there are no signs of matrilineal or patrilineal discontinuities in the upper valley populations investigated in this study. Cultural dynamic processes and the change of cultural influence spheres in the transition from MH to LIP did not visibly affect the populations living in the fringe area between the coast and the highlands. The examined populations group together with ancient and modern Central Andean highland populations. Overall, the diachronic comparisons suggest a very homogeneous picture of genetic variability in the highland populations. The significant difference from the ancient coastal populations cannot be satisfactorily explained to date. The available models rest on plausibility alone and cannot yet be statistically validated. It can be stated that the cultural and demographic history of South America is too complex to be reconstructed only from modern indigenous populations. More aDNA data from coastal and highland populations dating to the earliest periods of human presence in South America are needed to obtain a gapless diachronic portrait of genetic variability. With such data, it would be possible to perform genetic modelling that considers diverse demographic scenarios (Anderson et al., 2005), helping to reach firm conclusions regarding the processes involved in the peopling of the continent and the impact of subsequent culturally and ecologically dynamic processes on human genetic distribution patterns.


The authors thank the German Federal Ministry of Education and Research for funding our research (Grant number: 01UA0804B) and all Peruvian and German colleagues in the research group “Andentransekt – Climate sensitivity of pre-Columbian Men-Environment-Systems” for the scientific collaboration, as well as Mr. Matthew Rector for proofreading the manuscript. We also thank the Peruvian Institute for Cultural Heritage (Instituto Nacional de Cultura del Perú) for granting us the export permit for the sample material (Resolución Directoral Nacional No. 1346).