Molecular genetic techniques and the data they produce can inform us about numerous aspects of our evolutionary history. Molecular analyses can shed light on phylogeny, demographic history, the history of selective forces, and even aspects of anatomy and physiology that are not preserved in the hard tissues of the fossil record.
When Darwin (1859, p. 490) first stated that “Light will be thrown on the origin of man and his history” near the end of the Origin of Species, the only two lines of evidence toward this end were comparative anatomy and paleontology.
The only hominin fossils known during the 19th century belonged to Neanderthals, early modern humans, and Homo erectus. The debates around them centered on whether they were pathological humans or related to apes, and on what they might tell us about the origin of our own species. Many early anatomists and paleontologists initially denied that these fossils were even related to humans (Trinkaus and Shipman, 1993). The discovery of numerous robust large-brained skulls with large brow ridges in the first decades of the 20th century, confusingly referred to as “archaic Homo sapiens” and now more often referred to as H. heidelbergensis (Schoetensack, 1908), led to a variety of scenarios as to how we were related to Neanderthals and H. erectus (Trinkaus and Shipman, 1993).
For the first part of the 20th century, modern human variation and therefore different groups of modern humans were tied directly to the fossil record. Various researchers proposed an anagenetic series with H. erectus giving rise to Neanderthals and then modern humans, at least in Europe, laying the groundwork for what became known as the regional continuity or more commonly the multiregional model of human evolution (MRE). H. erectus in Asia was proposed to have evolved into modern Asians, whereas middle Pleistocene hominins in Africa similarly gave rise to modern Africans.
“Those facts, together with the morphological evidences, suggest that man has evolved in different parts of the old world. The Australian natives have some of their characteristics in common with the fossil Wadjak-Keilor man and with Homo soloensis. Homo soloensis himself appears as an advanced Pithecanthropus phase. Some of the characteristic features of Sinanthropus reappear in certain Mongolian groups of today. The same relation exists between Rhodesian man and certain fossil South African forms of modern man. The Skhūl group of Palestine presents forms intermediate between the typical Neanderthal man from Tabūn and fossil modern man from Europe.” (Weidenreich, 1947, p. 201)
Starting in the 1920s and 1930s, a growing collection of more distantly related fossil hominins including Australopithecus africanus and A. robustus made it clear that human evolutionary history extended much more deeply into the past (Dart,1925; Broom,1938). In the 1950s and 1960s, numerous hominins of these and other older lineages were discovered in East and South Africa. For several decades thereafter, the emphasis of research on human evolution shifted to Africa and to the discovery, description, and analysis of Plio-Pleistocene hominins including several species of Australopithecus, Paranthropus, and early members of the genus Homo including H. habilis and H. rudolfensis (Leakey et al.,1964; Walker and Leakey,1978; Kimbel et al.,1984; Brown et al.,1985; Leakey et al.,1995; Leakey and Walker,1997).
Interest in modern human origins continued among morphologically focused researchers in the 1970s and 1980s. Although Weidenreich (1947) posited regional continuity of traits from H. erectus and regional archaic populations, he did recognize the necessity of gene flow between the different regions to maintain the unity of the modern human species. Coon (1962), among others, downplayed the gene flow between regions and instead saw the populations of the various regions of the world as originating nearly independently from H. erectus and regional archaics. Although this is clearly an untenable model of evolution (Dobzhansky, 1963), more realistic models have been put forth incorporating various levels of gene flow between regions (Wolpoff et al., 1984; Frayer et al., 1993).
In contrast to MRE, Howells (1976) proposed the “Noah's Ark” hypothesis, now often referred to as the Recent African Origin (RAO) model.
“I do not use ‘Noah's Ark’ in derision, but only to label the pattern starkly: a single origin, outward migration of separate stirps, like the sons of Noah, and an empty world to occupy, with no significant threat of adulteration by other gene pools or even evaporating gene puddles.” (Howells,1976, p. 480)
Stringer and Andrews (1988) elaborated on this model by integrating paleontological data with the then growing body of genetic data; their model posited very little interbreeding between the modern human migrants out of Africa and regional archaics. Intermediate models such as the African hybridization and replacement model proposed by Bräuer (1982) predict a relatively large amount of interbreeding between regional archaics and the wave of anatomically modern African migrants. Smith et al. (2005) proposed a scenario with changing selection pressures, gene flow, and admixture between archaics and migrating modern Africans, often referred to as the assimilation model. Aiello (1993) provided a good summary of these various models.
These models predict that different patterns of morphological and genetic variation should be found in different regions of the world. The RAO model predicts that the very earliest anatomically modern fossils would first appear in Africa, while archaic populations, descendants of H. erectus or later archaics, would be found throughout Eurasia. Outside of Africa, there should be few alleles older than the origins of anatomically modern humans, and a clinal decrease in genetic diversity should emanate from Africa. The MRE and other models that posit regional continuity would predict various levels of ancient alleles found in different regions along with both transitional fossil forms and levels of morphological continuity in the major areas of the Old World.
INFERRING THE ORIGINS OF MODERN HUMANS FROM CURRENT GENETIC DIVERSITY
The use of genetics to investigate human origins began early in the 20th century, shortly after the discovery of the ABO blood group system. Nuttal (1904) concluded that the cross reactivity of blood from different individuals could be used to infer their genetic relationships. Despite significant work carried out by various researchers attempting to use blood groups to infer the relationships of different human groups, it was not until proteins (in the 1960s) and DNA itself (1970s and 1980s) could be directly analyzed that significant progress was made (Disotell,2000b). Cavalli-Sforza and Edwards (1964) estimated genetic distances between five populations based on five different blood groups and inferred a tree in which Europeans were separated from an Afro-Asian lineage. This fit with the paleontological view at the time given that the earliest known modern humans were the Cro-Magnon fossils from western Europe, which were thought to be 35,000 years old (Coon,1962; Klein,2009). A much larger study by Nei and Roychoudhury (1974) yielded a more prescient result. Although the gene frequencies for 21 blood group systems yielded ambiguous results, the analysis of 35 proteins linked Europeans and Asians to the exclusion of Africans. More interestingly, they estimated that Europeans and Asians split from each other ∼55,000 years ago while they diverged from Africans around 120,000 years ago (Nei and Roychoudhury,1974).
Perhaps the most influential genetic study of human origins was carried out by Cann et al. (1987), who examined the maternally inherited mitochondrial DNA (mtDNA) of 147 individuals representing the major populations from around the world. A maximum parsimony tree inferred from the 135 unique mtDNA types (haplotypes) had its deepest roots within a group of individuals of African descent. Molecular clock estimates placed the deepest split in the tree just less than 200,000 years ago. The non-African lineages were inferred to be just more than 50,000 years old. The implication of this study was that any archaic population outside of Africa older than 50,000 years did not leave mtDNA behind in contemporary populations. This study thus laid the groundwork for a 25-year debate over the MRE and RAO hypotheses.
The study by Cann et al. (1987) was criticized on numerous grounds, including its use of high-resolution restriction mapping, the use of African Americans to represent African populations, and the maximum parsimony analysis itself, especially its use of midpoint rooting. A subsequent study by Vigilant et al. (1991) reached essentially the same conclusion after carrying out DNA sequence analysis with sub-Saharan African samples and using outgroup rooting to the chimpanzee D-loop sequences. Although no phylogenetic analysis of so many individuals can be guaranteed to find the most parsimonious tree, a strong argument for the African origin of all contemporary mtDNA lies in its pattern of world-wide variation. The majority of world-wide mtDNA variation is found within Africa, with less than half of the total found outside of sub-Saharan Africa (Ingman et al., 2000). When median-joining networks, which relax parsimony to allow limited homoplasy, are inferred from large samples of mtDNA, six major haplogroups (closely related lineages of mtDNA haplotypes) are found (Bandelt et al., 1999). All six of these mtDNA haplogroups are found within the sub-Saharan African samples and encompass all of modern human diversity (Kivisild et al., 1999). All Eurasian and New World mtDNAs are derived from just one of these African haplogroups within the last 50,000–60,000 years (Kivisild et al., 1999).
Whole human mtDNA genomes now form the basis for many analyses, beginning with the first one (a European composite) characterized by Anderson et al. (1981), followed by those of a Japanese and an African individual (Horai et al., 1995) and then by a world-wide sample of 53 people (Ingman et al., 2000). As of 2012, more than 18,000 whole human mtDNA genomes have been published and all of them fall within the major haplogroups defined by previous studies; that is, they all coalesce within the last 200,000 years, with the non-African lineages only 50,000–60,000 years old or younger. Therefore, any hominin lineage that branched off from the modern human species prior to 200,000 years ago did not leave any mtDNA that is present today.
The lack of ancient mtDNA in modern humans today does not necessarily mean that archaic DNA is not present in the human genome. Scenarios in which all archaic mtDNA lineages were lost due to genetic drift or were replaced by more recent ones are indeed possible (Relethford,2001). Currat and Excoffier (2004) have modeled the modern human expansion into Europe and their potential interbreeding with Neanderthals and concluded that the lack of archaic mtDNA in the modern population is compatible with at most 120 hybridization events over a 12,000-year period of potential contact (i.e., one per 100 years).
mtDNAs present a special case for inferring past population history. Because of its strictly maternal pattern of inheritance, mtDNA has an effective population size one-quarter as large as that of a typical autosome (see Table 1 for a glossary of terms). The Y chromosome presents a similar situation. In fact, given the greater variation in male reproductive success, the Y chromosome has a slightly smaller effective population size than mtDNA. The X chromosome on the other hand has an effective population size three times larger than mtDNA or Y chromosomes and three-fourths of that of a typical autosome. The easiest way to think of it is that a male has only one maternal grandmother from which to inherit his mtDNA, one paternal grandfather for his Y chromosome, three possible grandparents for his X chromosome, and four grandparents for any one single autosome. The larger the effective population size of a locus, the further back in time it will coalesce to a common ancestor, that is, a threefold increase in effective population size will yield a coalescence time three times older for the X chromosome than for the Y. Another implication of this disparity is that it is much more likely for mtDNA and Y chromosome haplotypes to go extinct or be replaced because of their relatively small numbers.
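The ratios in the preceding paragraph can be reproduced with a small calculation. The sketch below is illustrative only: it assumes an idealized population with equal numbers of breeding males and females and no extra variance in male reproductive success, and the names `COPIES_PER_PAIR`, `relative_ne`, and `expected_tmrca` are hypothetical helpers introduced here, not part of any published method.

```python
from fractions import Fraction

# Transmissible copies of each locus type carried by one breeding female
# plus one breeding male (idealized 1:1 sex ratio; an assumption).
# autosome: 2 + 2, X: 2 + 1, Y: 0 + 1, mtDNA: 1 (maternal copy only).
COPIES_PER_PAIR = {"autosome": 4, "X": 3, "Y": 1, "mtDNA": 1}


def relative_ne(locus, reference="autosome"):
    """Effective population size of `locus` relative to `reference`."""
    return Fraction(COPIES_PER_PAIR[locus], COPIES_PER_PAIR[reference])


def expected_tmrca(locus, reference_locus, reference_tmrca_years):
    """Scale a reference coalescence time by the ratio of effective sizes.

    Assumes neutrality, so expected coalescence time is proportional to
    effective population size.
    """
    return reference_tmrca_years * relative_ne(locus, reference_locus)


if __name__ == "__main__":
    for locus in COPIES_PER_PAIR:
        print(locus, "Ne relative to an autosome:", relative_ne(locus))
    # A 200,000-year mtDNA coalescence implies roughly three times that for the X.
    print("Expected X TMRCA:", expected_tmrca("X", "mtDNA", 200_000), "years")
```

Under these assumptions the X chromosome comes out at three-fourths of an autosome and three times mtDNA or the Y; real populations depart from the idealized ratios, for example through the greater variance in male reproductive success noted above.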
Table 1. Glossary
Allele: One of two or more versions of a gene
Autosomes: The non-sex chromosomes (there are 22 pairs in humans)
Control region: A noncoding fast-evolving region of the mitochondrial genome containing the D-loop with two hypervariable regions (HVRI and HVRII)
D-loop: Displacement loop, a portion of the noncoding control region of the mitochondrial genome, often (mistakenly) referred to as the control region
Deamination: The removal of an amine from a molecule, can spontaneously change the nucleotide cytosine to uracil among other transformations
Effective population size (Ne): The estimated number of actually reproducing individuals in a population
Genetics: The study of the structure and/or function of genes
Genome: The entire complement of an individual's hereditary material including both coding and noncoding DNA (the human genome consists of ∼3.2 billion bases)
Genomics: The study of all or large portions of organisms' genomes
Haplotype: The specific combination of alleles at a particular genetic locus
HVRI: Hypervariable region 1, an ∼550-base pair stretch of the control region often used to differentiate individuals within populations
Insertion: A mutation resulting in one or more additional nucleotides (the opposite of a deletion)
LCA: Last common ancestor, usually used to refer to the common ancestor of two or more species
Locus: The site on a chromosome where a gene or DNA region is located
MicroRNAs: Short RNA molecules (average of 22 nucleotides) that bind to complementary sequences and have a regulatory function
MRCA: Most recent common ancestor, the most recent individual from which a group of members within a species share descent
mtDNA: Small maternally inherited circular genome found in the mitochondria of eukaryotic cells (∼16,500 bases long in humans)
Numts: Nuclear mitochondrial pseudogenes, a portion of the mitochondrial genome that has transferred into the nuclear genome as a pseudogene
PEC: Primer extension capture, a method to enrich DNA samples for regions of interest prior to high-throughput sequencing
Pseudogene: A nonfunctional gene or copy of a gene that no longer is expressed as a protein
SNP: Single-nucleotide polymorphism, variant at a single nucleotide position
Transition: A nucleotide substitution between purines (A or G) or between pyrimidines (C or T)
Transversion: A nucleotide substitution between a purine and a pyrimidine
Substitution: Change between one nucleotide and another, often called a mutation
Although far fewer modern Y chromosomes have been sampled than mtDNA, studies have universally found that all modern Y chromosomes can be traced relatively recently to Africa. Hammer et al. (1998) found a world-wide coalescence of the ancestor of all Y chromosomes in Africa ∼150,000 years ago. Underhill et al. (2000) inferred an expansion out of Africa around 44,000 years ago also based on Y chromosome data. The slightly younger dates than those inferred from mtDNA fit quite nicely given the slightly smaller effective population size of the Y chromosome. In any case, these recent dates suggest that as with mtDNA, archaic populations outside of Africa did not leave a Y chromosome signature in the modern human gene pool.
Many other loci and genetic systems have been characterized in modern humans to infer their history. An interesting case is a 10,000-base pair segment of the X chromosome that Kaessmann et al. (1999) sequenced in 69 individuals from around the world. The time estimated to the most recent common ancestor (MRCA) for this noncoding segment of DNA is 535,000 ± 119,000 years, which is approximately three times greater than the dates estimated for mtDNA and Y chromosome MRCAs (Kaessmann et al.,1999). This squares nearly perfectly with those estimates after taking into account effective population size. Similarly, a 3,000-base pair region of the β-globin gene located on an autosome yielded an estimated value of 750,000 years or about four times that of mtDNA and Y chromosome estimates for its MRCA (Harding et al.,1997).
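As a rough check on the claim that these estimates square with the mtDNA and Y chromosome dates, the reported values can be rescaled by the effective-size ratios given earlier (3 for the X chromosome, 4 for an autosome). The snippet is a back-of-the-envelope sketch; the function name `implied_uniparental_tmrca` is hypothetical, and treating the ± value as a simple interval is an assumption.

```python
def implied_uniparental_tmrca(tmrca_years, ne_ratio):
    """Rescale a locus's TMRCA to the value expected for mtDNA or the Y.

    `ne_ratio` is the locus's effective size relative to mtDNA/Y
    (3 for the X chromosome, 4 for an autosome), assuming neutrality.
    """
    return tmrca_years / ne_ratio


# X chromosome segment: 535,000 +/- 119,000 years (Kaessmann et al., 1999)
x_low = implied_uniparental_tmrca(535_000 - 119_000, 3)
x_high = implied_uniparental_tmrca(535_000 + 119_000, 3)

# Beta-globin region: 750,000 years (Harding et al., 1997)
beta_globin = implied_uniparental_tmrca(750_000, 4)

if __name__ == "__main__":
    print(f"X-implied mtDNA/Y TMRCA: {x_low:,.0f} to {x_high:,.0f} years")
    print(f"Beta-globin-implied mtDNA/Y TMRCA: {beta_globin:,.0f} years")
```

Both rescaled values fall in the neighborhood of the roughly 150,000-year (Y chromosome) and 200,000-year (mtDNA) coalescence dates cited in the text.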
A few studies have proposed that non-African alleles for several loci are more ancient than those found in Africa and concluded that they must have arisen in archaic Eurasian populations and spread back into Africa. Harris and Hey (1999) estimated a 1.86-million-year ancestry for modern human populations based on the PDHA1 gene. This particular analysis has been strongly criticized by others because the locus is known to be under fairly strong selection and because the sampling of Africans was rather incomplete (Disotell, 1999; Seielstad et al., 1999). In any case, the variance around estimates of MRCAs is very large, and these putatively ancient alleles may just fall in the tail of the expected distribution. Templeton (2002) has applied “nested clade analysis” to multiple loci to infer multiple expansions out of Africa with recurrent gene flow between more ancient and archaic lineages and more recent ones. However, Panchal and Beaumont (2010) found serious flaws with nested clade analyses, calling into question the inferences drawn by Templeton.
Several authors have modeled modern human diversity and suggest that there may be evidence for archaic introgression in the modern genome. Plagnol and Wall (2006) generated a reasonable model to describe recent African and European evolutionary history that included a limited amount of gene flow between the two populations even after the out of Africa bottleneck. They found that their model cannot explain the pattern of modern polymorphisms without ∼5% admixture in the European population from an archaic population (presumably Neanderthals). They do admit that other scenarios could also explain the patterns of variation that they observed (Plagnol and Wall, 2006). Hammer et al. (2011), also focusing on patterns of polymorphisms within Africa, concluded that up to 2% of the alleles in some African populations are derived from archaic African populations (clearly not Neanderthals) that introgressed into modern African populations around 35,000 years ago. A more recent study that sequenced 15 whole genomes from Pygmy and click-speaking Hadza and Sandawe hunter-gatherer populations in Africa infers evidence of archaic introgression into all three populations (Lachance et al., 2012). The archaic population is purely theoretical at this point as no plausible fossil has been put forth. By focusing on runs of single-nucleotide polymorphisms (SNPs) forming long haplotypes in the new genomes and estimating each such haplotype's time to MRCA, they detected signals of introgression into these populations of ancient alleles. These ancient alleles are inferred to have split from the human lineage ∼1.2 million years ago, similar to the estimated split of Neanderthal and Denisovan alleles from modern humans. These putatively archaic alleles are inferred to have introgressed into the modern hunter-gatherers 20,000–80,000 years ago (Lachance et al., 2012).
The pattern of shared inferred archaic alleles suggests that the introgression either occurred in the common ancestor of the three populations or that there has been gene flow between them since their original split.
Nevertheless, it is clear that human diversity originally arose in Africa and that the bulk of modern non-African diversity is recently derived from an African source. The exact timing and pathway(s) out of Africa are still under debate. Did an African population or multiple populations exit via the Sinai into western Asia, or did they cross the Bab al Mandab into the southern Arabian Peninsula, or both? Several recent studies have proposed that a southern route is likely at least for populations heading toward Asia (Macaulay et al., 2005). Armitage et al. (2011) proposed a much earlier exit, prior to 75,000 years ago, through the southern Arabian Peninsula route. However, this is based solely on stone tools whose makers are unidentifiable. A similar proposal based on the analysis of a complete Australian aboriginal genome concluded that an early rapid dispersal between 62,000 and 75,000 years ago was followed by a later dispersal of the populations that would give rise to the majority of East and South Asians (Rasmussen et al., 2011). This analysis, however, does not take into account later Neolithic movements of people in Europe and Asia that would not have affected Oceanic populations. Such later movements and gene flow would make the former two populations look more similar to each other and therefore may provide a false signal of two separate migrations.
A majority of studies suggest a ∼50,000-year-old migration out of Africa, perhaps via the southern route. All of these contemporary studies extrapolate backward in time from modern patterns of variation. The ability to sample ancient DNA (aDNA) extracted from fossils opens up another exciting avenue of inquiry into the fate of older lineages.
ANCIENT HOMININ DNA
The first aDNA sequences generated in 1984 were from a museum specimen of a quagga, a zebra-like species that went extinct in the 19th century (Higuchi et al.,1984). Standard DNA cloning techniques were used to generate 229 bases of mtDNA. The following year, a 2,400-year-old Egyptian mummy similarly yielded DNA amenable to recombinant cloning techniques (Pääbo,1985). These earliest attempts at sequencing aDNA required extremely well-preserved tissues and therefore worked best with relatively recent museum specimens.
The advent of polymerase chain reaction (PCR) techniques (Mullis,1990), which can amplify as little as a single molecule of DNA into billions of copies in a matter of hours, led to new attempts to sequence even more aDNA in the second generation of aDNA studies. In 1997, one of the original Neanderthal specimens, the 40,000-year-old humerus of Feldhofer 1, found in the Neander Valley of Germany for which Neanderthals are named, yielded 379 bases of mtDNA from the hypervariable region I (HVRI) of the control region or D-loop (Krings et al.,1997). Because of the degraded nature of the DNA, 13 different very short amplifications and 123 clones derived from PCR products were required to produce the Neanderthal sequence. Many clones produced slightly discordant sequences so that a consensus for every base was generated based on the majority of the clones.
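The majority-rule consensus described above can be sketched in a few lines. This is a minimal illustration and not the actual procedure of Krings et al. (1997): the function name `majority_consensus` and the toy clone reads are hypothetical, and the sketch assumes the clones have already been aligned to the same length.

```python
from collections import Counter


def majority_consensus(clones):
    """Return the per-position majority base across pre-aligned clone reads.

    Ties are broken in favor of the base seen first at that position,
    which is an arbitrary choice made for this sketch.
    """
    if not clones:
        raise ValueError("at least one clone sequence is required")
    length = len(clones[0])
    if any(len(clone) != length for clone in clones):
        raise ValueError("clones must be aligned to the same length")

    consensus = []
    for column in zip(*clones):
        # most_common(1) yields the (base, count) pair with the highest count
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical clone reads; two carry isolated discordant bases.
    toy_clones = ["ACGTC", "ACGTT", "ACGTC", "ATGTC"]
    print(majority_consensus(toy_clones))
```

In practice, positions where clones disagree systematically (rather than sporadically) would need closer inspection, since they can indicate contamination or damage rather than random error.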
This first sequence generated via PCR demonstrated several of the difficulties in sequencing aDNA. First, aDNA usually consists of relatively to highly degraded short fragments, often less than 100 bases long (Green et al., 2008). Second, the DNA strands often contain errors, especially near the ends of the short molecules. Deamination of cytosine (C), which converts it to uracil (U), is particularly common; because uracil acts like a thymine in later reactions, deamination mimics a C to T transition where none actually occurred. Guanine (G) to adenine (A) substitutions are also common. Third, DNA is often contaminated with exogenous DNA from multiple sources including the original excavators, researchers who have handled the material over the years, laboratory personnel involved in amplification and sequencing the sample, and microbes and other organisms in the soil in which the fossil was buried.
After stringent controls, including replication in another laboratory, the 379-base Feldhofer sequence was compared with modern human sequences. When compared with the human reference sequence (Anderson et al., 1981), there were 27 differences: 24 transitions (common substitutions within a nucleotide class), two transversions (much rarer substitutions between classes), and an insertion. When compared with more than 2,000 human sequences and 59 chimpanzee sequences, the types and positions of the changes fit the predicted evolutionary pattern, further suggesting that the sequence is indeed a genuine Neanderthal sequence. The human sequences varied amongst themselves by an average of 8.0 ± 3.1 substitutions, whereas there were 27.2 ± 2.2 differences between humans and the Neanderthal. Interestingly, the Neanderthal sequence was no more closely related to contemporary Europeans than to any other population. An inferred phylogenetic tree showed that the Neanderthal diverged before the divergence of any living human population. Using a human and chimpanzee divergence date of 4–5 million years (probably about 1 million years too recent), a date of 550,000–690,000 years before present was estimated for the Neanderthal–human split (Krings et al., 1997).
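The breakdown of the 27 differences into transitions, transversions, and an insertion follows from a simple per-site classification. The sketch below shows one way to tally such differences between two aligned sequences; the function names and the toy sequences are hypothetical, and counting each gap column as one indel is a simplifying assumption.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES | {"-"}


def classify_site(a, b):
    """Classify one aligned column as match, indel, transition, or transversion."""
    if a not in VALID or b not in VALID:
        raise ValueError(f"unexpected character in column: {a!r}, {b!r}")
    if a == b:
        return "match"
    if a == "-" or b == "-":
        return "indel"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"  # purine<->purine or pyrimidine<->pyrimidine
    return "transversion"    # purine<->pyrimidine


def count_differences(seq1, seq2):
    """Tally difference types between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(classify_site(a, b) for a, b in zip(seq1, seq2))
    counts.pop("match", None)
    return counts


if __name__ == "__main__":
    # Toy alignment: one transition (A/G), one transversion (G/T), one indel.
    print(dict(count_differences("ACGT-A", "GCTTCA")))
```

Applied to the reported Feldhofer comparison, the same bookkeeping gives 24 + 2 + 1 = 27 differences.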
Criticisms of this initial study and its conclusions poured forth. These included questions about the degree of contamination, given that modern DNA was found by the researchers, and about the fact that the sequence represented only a single individual. They also included an especially valid critique of the modeling of interbreeding (Nordborg, 1998). Nordborg's study demonstrated that although random mating of Neanderthals and modern humans could be ruled out, other scenarios of interbreeding were possible. Nordborg (1998) cautioned that without additional Neanderthal mtDNA sequences and information from other loci, one could not infer that no interbreeding occurred. Wolpoff (1998) and Gutiérrez et al. (2002) pointed out that some of the human sequences in the original comparative study differed more from each other than the Neanderthal sequence did from other humans in the sample. However, these purely phenetic comparisons, based on distances alone and not on the structure of a cladogram constructed from these sequences, are meaningless given the strong reciprocal monophyly shown between all modern humans and the Neanderthal sequence (Disotell, 1999).
The second mtDNA sequence generated was an additional 340 bases from the hypervariable region II (HVRII) of the control region of the same Feldhofer 1 individual (Krings et al., 1999). Combining the two hypervariable regions improved resolution of the tree. Coalescence dating again yielded a Neanderthal lineage that fell outside of all modern humans, joining the tree 465,000 years ago. However, as the sequence was from the same individual, Nordborg's (1998) critique of insufficient data to exclude interbreeding still held.
A year later, a second Neanderthal individual, the Mezmaiskaya infant from the northern Caucasus (Ovchinnikov et al., 2000), was sequenced. A 345-base pair region of HVRI was characterized and compared with the Feldhofer Neanderthal and nearly 6,000 modern humans. The Mezmaiskaya and Feldhofer sequences clustered closely together, again suggesting that they were authentic endogenous Neanderthal DNA and not modern contaminants. Their estimate of Neanderthal-modern human divergence was between 365,000 and 853,000 years, similar to that of the Feldhofer estimate. More interestingly, the two Neanderthals provided the first estimate of Neanderthal genetic diversity. The genetic sequences of these Neanderthals, separated by 2,500 km, diverged between 151,000 and 352,000 years ago. In comparison, the same model estimated that modern human diversity accumulated between 106,000 and 246,000 years ago (Ovchinnikov et al., 2000).
Shortly thereafter, a third Neanderthal HVRI and partial HVRII sequence was generated from an individual from Croatia, Vindija 75 (Krings et al., 2000). In this case, the use of the chemical N-phenacylthiazolium bromide dramatically improved the retrieval of endogenous Neanderthal DNA relative to contaminating modern human DNA. Additional chemical treatments have continued to improve researchers' ability to extract endogenous aDNA (e.g., Hofreiter et al., 2001; Briggs et al., 2010).
These three Neanderthal sequences cluster together into a single clade to the exclusion of all modern humans. Notably, despite the fact that only three Neanderthal sequences were known at the time, even this small sample had a 50% probability of containing the deepest divergence among Neanderthals. The Neanderthals' overall diversity was less than that found within modern humans, suggesting a similar relatively recent bottleneck and population expansion (Krings et al.,2000).
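The 50% figure is consistent with a standard result from neutral coalescent theory: for a random sample of n lineages from a large, randomly mating population, the probability that the sample includes the deepest split in the whole population's genealogy is (n − 1)/(n + 1). Whether this is exactly the calculation behind the statement above is an assumption; the function name below is hypothetical.

```python
from fractions import Fraction


def prob_sample_contains_deepest_split(n):
    """Probability that a random sample of n lineages spans the deepest
    divergence of the whole population's genealogy.

    Uses the large-population coalescent approximation (n - 1) / (n + 1),
    which assumes neutrality and random mating.
    """
    if n < 2:
        raise ValueError("a sample needs at least two lineages")
    return Fraction(n - 1, n + 1)


if __name__ == "__main__":
    for n in (2, 3, 10, 15):
        p = prob_sample_contains_deepest_split(n)
        print(f"n = {n:2d}: {p} (about {float(p):.2f})")
```

With n = 3 the probability is 2/4, that is, one half; it rises quickly as more sequences are added.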
In the 5 years between 2002 and 2007, a dozen Neanderthal HVRI sequences were recovered (Schmitz et al., 2002; Serre et al., 2004; Beauval et al., 2005; Caramelli et al., 2006; Lalueza-Fox et al., 2006; Orlando et al., 2006; Krause et al., 2007) (Table 2). The Neanderthal samples ranged from El Sidrón in Spain (Serre et al., 2004) to Okladnikov in Siberia (Krause et al., 2007). The oldest is the 100,000-year-old specimen from Scladina Cave in Belgium (Orlando et al., 2006). All of these new Neanderthal sequences and the three original sequences clustered together in a single clade to the exclusion of modern Europeans and in fact all modern humans. The oldest sequence from Scladina, which is at least 50,000 years older than the next oldest, was also the most divergent. This suggests that Neanderthals may have undergone a bottleneck sometime after 100,000 years ago, reducing their diversity to well below that of recent modern humans.
The studies of Neanderthal mtDNA diversity through time and space have also yielded several insights into Neanderthal demography. Despite their relatively large geographic range, Neanderthals are likely to have had a smaller effective population size than modern humans, one that is more on the order of that found amongst western Europeans or Asians (Krause et al., 2006; Green et al., 2008; Briggs et al., 2009). Furthermore, the oldest and eastern-most Neanderthals cluster together to the exclusion of western European Neanderthals younger than 48,000 years old. Fabre et al. (2009) and Dalén et al. (2012) suggested that the western populations experienced rapid decline and population replacement, whereas the eastern populations remained more stable. Whether this was due to competition or climatic change is unknown.
One of the first attempts to sequence aDNA from an early modern human fossil took place in 2001. Adcock et al. (2001) sequenced several Australian fossils, including Lake Mungo 3, and inferred that the deepest known human mtDNA lineage is found in Australia. Multiple problems with their sequencing strategy, their analysis, and their conclusions have been identified (Cooper et al., 2001). Few of the standard laboratory procedures used to sequence aDNA (Cooper and Poinar, 2000; Hofreiter et al., 2001) were followed, and the sequence was not verified by an independent laboratory. Adcock et al. (2001) concluded that the most closely related lineage to the Lake Mungo 3 specimen was in fact a mitochondrial pseudogene located on chromosome 11, suggesting possible modern contamination, and thus their sequence should not be compared with true mtDNA. Smith et al. (2003) examined the evidence for DNA preservation at various sites and concluded that the thermal conditions at Lake Mungo exceeded those conducive to aDNA survival. Furthermore, even without these problems, the conclusion provided by Adcock et al. (2001) does not in fact support the “multiregional model” according to Cooper et al. (2001), based on the structure of their inferred phylogenetic tree.
Given the difficulty of retrieving endogenous DNA from a fossil specimen, which is almost always contaminated with some level of modern human DNA, the sequencing of early modern humans is particularly problematic. To avoid modern contamination, guidelines were laid out by Cooper and Poinar (2000) and Hofreiter et al. (2001). These include a physically isolated work area, control amplifications, observing appropriate molecular behavior, reproducibility, cloning, independent replication, appropriate biochemical preservation, quantitation, and amplification of associated remains.
Using these guidelines, Caramelli et al. (2003) examined the HVRI sequences of two early modern human specimens from Italy, Paglicci-25 and Paglicci-12, that date between 23 and 24.7 ka. One specimen was identical to the Cambridge Reference Sequence, which is found in 14% of Europeans and Near Easterners, whereas the other differed by an average of 2.34 substitutions when compared with these same populations (Caramelli et al., 2003). On the other hand, the Paglicci specimens differed by 22–28 substitutions from the then known Neanderthal sequences, and from this result, Caramelli et al. (2003) concluded that Neanderthals did not contribute to the modern European gene pool. Still, several research groups criticized this study, claiming that it would be nearly impossible to rule out contamination of early modern human sequences by contemporary DNA (Pääbo et al., 2004; Serre et al., 2004).
With the increasing database of Neanderthal HVRI sequences, Serre et al. (2004) undertook a particularly interesting approach to detect endogenous early modern DNA. They reasoned that instead of trying to rule out contamination of early modern human DNA samples by contemporary modern humans, one could focus on Neanderthal markers using Neanderthal-specific PCR primers. In sampling four Neanderthal individuals (two from Vindija, Engis 2, and La Chapelle-aux-Saints), they found Neanderthal-specific substitutions. In contrast, five early modern samples (two from Mladec and one each from Cro-Magnon, Abri Pataud, and La Madeleine) did not amplify with Neanderthal-specific primers but did amplify with more general hominoid primers. Cave bears from the sites also yielded aDNA, suggesting that conditions were adequate for good preservation. Assuming that they could not trust early modern sequences and given the failure of the Neanderthal primers in samples that should have yielded aDNA, Serre et al. (2004) concluded that these early modern samples did not show evidence of hybridization with contemporary Neanderthals. Although not ruling out introgression due to the small sample size, this study nevertheless corroborated the growing view of little to no introgression.
Currat and Excoffier (2004) followed the analysis by Serre et al. (2004) with an extensive simulation study to investigate the conditions under which Neanderthal introgression could be detected in mtDNA. By modeling the expansion of modern humans into Europe along with competition and admixture with Neanderthals, they estimated a less than 0.1% interbreeding rate. According to their modeling, the absence of Neanderthal mtDNA in the European population, even after 12,000 years of overlap, would mean that fewer than 120 mating events took place over the entire range. An important component of this model is that the front edge of an expanding population, where interbreeding would most likely take place, acts like a wave carrying new mutations and introgressed alleles forward to higher and higher levels rather than losing them to drift. This phenomenon of iterative founder effects is often referred to as “surfing the wave.” Relethford (2001), however, contended that different models incorporating interregional gene flow could yield a situation in which a similar level of Neanderthal ancestry should be found in multiple populations, not just those in Europe and the Near East.
NEXT-GENERATION SEQUENCING OF aDNA
In the middle of the first decade of the 21st century, new high-throughput sequencing methods became feasible. Commercial development of these technologies has proceeded at a breathtaking pace (Mardis,2008; Millar et al.,2008). Multiple different processes and technologies are used; however, they all generally produce short read lengths (i.e., dozens to the low hundreds of bases) by sequencing hundreds of millions to billions of bases per run. These technologies sequence hundreds of thousands to millions of individual DNA templates simultaneously. Because they sequence all of the DNAs in a sample, including bacteria and other contaminants, the analysis of the resulting data is difficult. Fortunately, aDNA can be identified based on its special properties including small size and patterns of damage. Most of the techniques rely on a DNA library that is created by directly adding a short adapter sequence to both ends of the short fragments contained in the sample. The entire sample can then be amplified by using primers that match the adapters rather than the source DNA; this DNA library can then be sequenced or further amplified for future analyses.
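One of the damage patterns mentioned above can be illustrated with a small calculation. The sketch below assumes that the relevant signature is an excess of C-to-T mismatches near the 5' end of reads, a pattern widely reported for ancient DNA; the function name, the ungapped toy alignments, and the choice of three positions are hypothetical and are not taken from any of the studies cited here.

```python
def c_to_t_by_position(alignments, max_pos=3):
    """Return the C-to-T mismatch rate at each of the first max_pos read positions.

    alignments: iterable of (reference, read) string pairs, assumed to be
    ungapped and oriented so that index 0 is the 5' end of the read.
    A position with no reference C yields None.
    """
    c_count = [0] * max_pos
    ct_count = [0] * max_pos
    for ref, read in alignments:
        for i in range(min(max_pos, len(ref), len(read))):
            if ref[i] == "C":
                c_count[i] += 1
                if read[i] == "T":
                    ct_count[i] += 1
    return [ct / c if c else None for ct, c in zip(ct_count, c_count)]


# Toy data (hypothetical): two of three reference C's at position 0 read as T.
toy_alignments = [
    ("CACGT", "TACGT"),
    ("CCGTA", "TCGTA"),
    ("CGCAT", "CGCAT"),
    ("ACCGT", "ACCGT"),
]
rates = c_to_t_by_position(toy_alignments)
print(rates)
```

In real data, an elevated rate at the first position relative to interior positions would be read as one line of evidence for ancient origin, alongside short fragment length; it would not by itself exclude contamination.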
Because of the short length of each individual read, it is important to have a closely related reference genome to help identify and place such short fragments into a partial or whole genome (Millar et al.,2008). One of the biggest advantages of these new techniques is that even with highly contaminated DNA or very small quantities, so much data are generated that any identifiable fragments are likely to be sequenced. One recent study found only 1.3% of the DNA sequenced was from the target; however, it was sufficient to generate thousands of bases of Neanderthal DNA (Noonan et al.,2006).
A premature attempt at characterizing nuclear aDNA applied Southern blot hybridization to hominin bones that could not be identified as either Neanderthal or recent human based on morphological criteria (Scholz et al.,2000). Using probes derived from either modern human or Neanderthal samples, they claimed to be able to identify unknown samples based on the intensity of the hybridization reaction. Unfortunately, Geigl (2001) clearly demonstrated that exogenous DNA from the soil (including bacteria and other organisms) overwhelmed any putative signal from the bone itself giving false positive results.
Five years later, nuclear aDNA was successfully retrieved from Pleistocene mammals, including Neanderthals. The first two successful studies characterizing nuclear aDNA included the generation of 27,000 base pairs from a Pleistocene cave bear (Noonan et al.,2005) and 13 million bases from a woolly mammoth (Poinar et al.,2006). These same research groups characterized more than one million bases from the Neanderthal genome in 2006. Both studies started from a 38,000-year-old specimen from Vindija, and the mtDNA analysis suggested that it contained 98% endogenous Neanderthal DNA and only 2% modern human contaminants (Noonan et al.,2006).
The first study directly cloned (without amplification) the DNA of Vindija specimen (Vi-80) to generate 62,250 bases of Neanderthal origin (Noonan et al.,2006). The study identified the Neanderthal origin of these sequences based on the signatures of damage to the DNA molecules, which suggested that they were of ancient origin. Their analysis inferred an average divergence time of 706,000 years ago for the various alleles and a population split between modern humans and Neanderthals of 370,000 years ago.
Average divergence times for different portions of the genome will almost always be older than the population split because most populations have variation within them, unless significant admixture occurs after populations split. One way to test for the date of divergence between populations is to look at SNPs within each population. If the split between modern humans and Neanderthals is old, only rarely should Neanderthals have the derived version of a modern human SNP (a variant present in some but not all modern populations). This is because if the variant appeared solely in the modern lineage and not in the shared ancestral lineage of modern humans and Neanderthals, then derived modern variants will not be found in Neanderthals. If Neanderthals and modern humans split recently or admixed significantly, then derived modern human SNPs should be common in the Neanderthal genome. Only three derived modern human SNP variants were found in the Neanderthal sample, and from this result, Noonan et al. (2006) concluded that little to no interbreeding had occurred.
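The logic of this test can be made concrete with a minimal sketch. It assumes a hypothetical list of sites that are polymorphic in modern humans, each given as an (ancestral allele, derived allele, Neanderthal allele) triple; the function name and the toy data are illustrative only and do not reproduce the procedure or the counts of Noonan et al. (2006).

```python
def derived_sharing_fraction(sites):
    """Fraction of informative modern human SNP sites at which the archaic
    sequence carries the derived allele.

    sites: iterable of (ancestral, derived, archaic) single-base strings.
    Sites where the archaic base matches neither allele are skipped, since
    they may reflect damage or sequencing error.
    """
    informative = 0
    derived_hits = 0
    for ancestral, derived, archaic in sites:
        if archaic not in (ancestral, derived):
            continue
        informative += 1
        if archaic == derived:
            derived_hits += 1
    if informative == 0:
        raise ValueError("no informative sites")
    return derived_hits / informative


# Toy data (hypothetical): one of four informative sites is derived.
toy_sites = [
    ("A", "G", "A"),
    ("C", "T", "C"),
    ("G", "A", "A"),
    ("T", "C", "T"),
    ("C", "T", "N"),  # uninformative, skipped
]
fraction = derived_sharing_fraction(toy_sites)
print(fraction)
```

Under an old split with little admixture, this fraction is expected to be low; a high value points to a recent split, to admixture, or, as later reanalysis of some datasets showed, to modern contamination.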
In contrast, Green et al. (2006) created a DNA library from the same sample used by Noonan et al. (2006) followed by an emulsion bead-based amplification using the adapter primers. They generated more than one million bases of putative Neanderthal DNA. However, contrary to Noonan et al. (2006), Green et al. (2006) found an average divergence time of 516,000 years ago and 30% of the SNPs were identical to the derived modern human type. From this result, they concluded that there must have been some level of admixture. However, Wall and Kim (2007) reanalyzed the data of Green et al. (2006) and noted that the longer sequence reads more closely resembled those of modern humans and more often shared derived SNPs than did their short reads. The dataset provided by Noonan et al. (2006) did not have this property. From this result, Wall and Kim (2007) concluded that the dataset of Green et al. (2006) had a significant amount of modern human contamination, now known to be up to 80%, most likely introduced at the commercial facility in which the final sequencing was carried out. Thus, by 2007, there was still little to no DNA evidence of Neanderthal admixture with modern humans (Hodgson and Disotell,2008).
In 2008, the first complete Neanderthal mtDNA genome, 16,565 base pairs long, was generated using high-throughput sequencing, from an individual from Vindija Cave in Croatia (Green et al.,2008). Almost immediately afterward, five complete Neanderthal mtDNA genomes were generated, including the Feldhofer 1 and 2, and Mezmaiskaya 1 individuals, an additional individual from Vindija, one from El Sidrón in Spain, and a partial mtDNA genome from Mezmaiskaya 2 (Briggs et al.,2009). These complete and partial mtDNA genomes were generated using a targeted approach called primer extension capture (PEC) in which specific portions of highly degraded genomes can be sequenced using high-throughput methods. The younger Neanderthals (38,000 to 70,000 years old) have about one-third of the genetic variation of modern humans today, suggesting a smaller effective population size. Not surprisingly, the easternmost and oldest sample, Mezmaiskaya 1 (60–70 ka), was the most divergent sequence. Interestingly, the more recent (42 ka) Mezmaiskaya 2 individual clusters within the clade of more recent western Neanderthals rather than with the older individual from the same site.
Despite earlier claims of the impossibility of accurately sequencing early modern fossils, high-throughput techniques also recovered the complete mtDNA genome of a ∼30,000-year-old early modern human from Kostenki, Russia (Krause et al.,2010a). This sequence carried five diagnostic substitutions that placed it in modern human haplogroup U2. Interestingly, its branch length from the inferred common ancestor of haplogroup U was very short as might be expected for an ancient sample. Today this haplogroup is found in North Africa, western Asia, and Europe, possibly suggesting some genetic continuity between early modern humans in Europe and present-day populations (Krause et al.,2010a).
By the end of 2009, a growing consensus based on the mtDNA of over two-dozen Neanderthal individuals found no evidence of female-mediated mtDNA gene flow between Neanderthals and modern humans (Currat and Excoffier,2004; Hodgson and Disotell,2008; Serre et al.,2004). As there is no evidence of Neanderthal mtDNA in the modern human gene pool, some authors have suggested that Neanderthal–human hybrids would have been rare with male hybrids being sterile (Mason and Short,2011). According to Haldane's rule, the heterogametic sex in interspecific hybrids (XY males in the case of mammals) will be absent, rare, or sterile (Short,1997). I have proposed that Neanderthals and modern humans may have had different diploid chromosome numbers (Disotell,2006). Given that the very short sequences generated from Neanderthals to date are aligned to the human genome, an accurate karyotype cannot be generated. This hypothesis, however, can be properly tested if very extensive coverage of the Neanderthal genome is generated in the future so that a de novo alignment can be created that does not require a human scaffold, or with the discovery of a frozen Neanderthal with intact cells that are able to be karyotyped.
ARCHAIC HUMAN GENOMES
Learning from their previous problems with modern human contamination in sequencing Neanderthal nuclear DNA, Green et al. (2010) successfully generated 1.3-fold coverage of the Neanderthal genome from three individual females (determined genetically) from Vindija cave, Croatia. More than 5.3 gigabases (Gb) of sequence was generated using two different high-throughput sequencing techniques after using methods to enrich the samples for Neanderthal-specific DNA and to reduce the microbial background that is always present in ancient samples. They aligned these sequences, many of which overlapped, to the human and chimpanzee genomes, and the human–chimp inferred common ancestral sequence. They also generated a geographically diverse sample of five contemporary human genomes for comparative purposes. They used three different methods to determine if modern human contamination was present. First, they determined that there was less than 0.5% mtDNA contamination. Second, as all three samples were females, the only Y chromosomal DNA present would be from modern contamination. They found ∼0.6% Y chromosomal contamination. Third, they compared the five modern humans with the chimpanzee to identify human-derived alleles. As many Neanderthal sequences overlapped, they were able to identify modern human-derived alleles that were present at sites in which at least two Neanderthals shared the ancestral allele found in the chimpanzee. The amount of modern contamination found using this technique was ∼0.7%. Overall, they were able to cover about 60% of the entire Neanderthal genome with less than 1% error.
Smaller amounts (0.1–0.2%) of additional genomic sequence were generated for El Sidrón, Feldhofer Cave, and Mezmaiskaya individuals. From these data, they estimated the human and Neanderthal populations split between 270,000 and 440,000 years ago. Even though all modern human mtDNA coalesces within the last 200,000 years, this population split falls within the range of the coalescence of many human nuclear genes (four times that of mtDNA), making it inevitable that Neanderthals and various humans will share some alleles.
To test for potential admixture between modern humans and Neanderthals, Green et al. (2010) carried out pairwise comparisons between each of their five modern genomes to look for an excess of shared derived polymorphisms with Neanderthals. Shared derived polymorphisms will indicate recently shared ancestry between the individuals carrying them. They found no excess of polymorphisms shared with their two African samples (San and Yoruba). Surprisingly, they found an excess of shared polymorphisms between the Neanderthal and all three non-African samples (French, Han Chinese, and Papua New Guinean). Most interestingly, the Chinese and Papuan samples were as similar to the Neanderthal as the French sample (Fig. 1).
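One common way to formalize an "excess of shared derived polymorphisms" is to count two site patterns, often called ABBA and BABA, across a pair of modern genomes, an archaic genome, and an outgroup that supplies the ancestral state. The sketch below is a minimal illustration under that assumption and is not presented as the exact procedure of Green et al. (2010); the function name and the toy sites are hypothetical.

```python
def d_statistic(sites):
    """Count ABBA and BABA patterns and return (abba, baba, D).

    sites: iterable of (h1, h2, archaic, outgroup) single-base strings,
    where h1 and h2 are two modern genomes. The outgroup base is taken as
    ancestral (A) and the archaic base, when different, as derived (B).
    D = (ABBA - BABA) / (ABBA + BABA); D > 0 means h2 shares more derived
    alleles with the archaic genome than h1 does.
    """
    abba = baba = 0
    for h1, h2, archaic, outgroup in sites:
        if archaic == outgroup:
            continue  # archaic is ancestral here, so the site is uninformative
        if len({h1, h2, archaic, outgroup}) != 2:
            continue  # keep strictly biallelic sites only
        if h1 == outgroup and h2 == archaic:
            abba += 1
        elif h2 == outgroup and h1 == archaic:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites")
    return abba, baba, (abba - baba) / (abba + baba)


# Toy data (hypothetical): h1 = an African genome, h2 = a non-African genome.
toy_sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("C", "T", "T", "C"),  # ABBA
    ("G", "A", "A", "G"),  # ABBA
    ("T", "C", "T", "C"),  # BABA
    ("G", "G", "G", "A"),  # BBBA, ignored
    ("A", "A", "A", "A"),  # no variation, ignored
]
abba, baba, d = d_statistic(toy_sites)
print(abba, baba, d)
```

With no gene flow and no ancestral substructure, the two patterns are expected to be equally frequent, so D should be close to zero. A real analysis would also need a measure of statistical uncertainty, which this sketch omits.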
To test if this admixture was recent or ancient, they looked for evidence of extended haplotypes (longer regions of the genome that are similar between two individuals). If there was no recent admixture, all of the regions of the genome should be relatively variable due to recombination since the two individuals last shared common ancestry. Instead, Green et al. (2010) found evidence of extended haplotypes shared by Neanderthals and non-Africans, suggesting that the admixture was relatively recent.
The estimate by Green et al. (2010) that between 1 and 4% of non-African genomes is derived from admixture with Neanderthals is incongruent with both the multiregional (MRE) and the recent African replacement (RAO) models.
None of the multiregional models predicts equal levels of mixing between Neanderthals and all Eurasians. For instance, some have proposed that shared anatomical features between Neanderthals and early modern Europeans are evidence of gene flow (Wolpoff et al.,2001). If there was no relationship between Neanderthals and Europeans, then these anatomical shared features would be either convergently derived or primitive retentions. Otherwise, these features should be present in Asia and Oceania. On the other hand, recent replacement advocates interpret the fossil record as showing no anatomical evidence of admixture (Stringer and Andrews,1988).
Green et al. (2010) proposed two alternate scenarios of Neanderthal–human admixture to explain their finding of 1–4% introgression. As mentioned above, the time to develop modern human diversity at nuclear loci overlaps with the divergence of Neanderthals. In the first scenario, if there was ancient substructure in the African population when the ancestors of Neanderthals colonized western Eurasia, then some African populations may be more closely related to Neanderthals. If such populations also gave rise to the modern human populations that migrated out of Africa, these migrants and their descendants would share alleles with Neanderthals to the exclusion of other Africans. As only two Africans were sampled in their study, this scenario could not be ruled out. Eriksson and Manica (2012) cautioned that analyses that do not take ancient African substructure into account are likely to infer introgression when none necessarily occurred. However, Yang et al. (2012) sampled a larger group of Africans and concluded that ancient African substructure is unlikely to explain the findings of Green et al. (2010).
The second scenario proposed and the one favored by Green et al. (2010) is that the admixture occurred shortly after moderns left Africa, presumably in western Eurasia before they spread further east and northwest. Thus, Neanderthal alleles would have been carried throughout all of Asia including Oceania as well as to Europe and would not be found in sub-Saharan Africa. Sankararaman et al. (2012) tested the scenarios of ancient substructure versus more recent introgression after modern humans migrated out of Africa by inferring the date of gene flow between Neanderthals and moderns. If the shared alleles are due to ancient substructure, gene flow would have to be as old as the oldest definitive Neanderthal fossils or at least 230,000 years ago. On the other hand, if gene flow occurred after the modern human exodus from Africa, the signal of gene flow would be less than 100,000 years old. In their analysis, they estimated that the last gene flow between Neanderthals and Europeans occurred between 37,000 and 80,000 years ago (Sankararaman et al., 2012).
Hodgson et al. (2010) proposed an alternative hypothesis that suggests the dynamics of early range expansion may explain the 1–4% contribution to the modern human genome. The range of modern humans and Neanderthals overlapped in western Eurasia when the African faunal zone extended into this region around 100,000 years ago. Hodgson et al. (2010) suggested that when the modern human range contracted back into Africa, some admixture may have occurred and these alleles may be at low frequency in East Africa. The subsequent migration of humans out of Africa could have carried these low-frequency alleles, which would be amplified and surf the wave to much higher frequencies via the iterated founder effect during their range expansion (Currat and Excoffier,2004, 2011). This model would be possible even if the migration from 50,000 to 60,000 years ago was through the Arabian peninsular route and not through the Sinai where they would have presumably met Neanderthals as proposed by Green et al. (2010).
The implications of 1–4% admixture are unclear as to the signature that they might have left on skeletal morphology. Currat and Excoffier (2011) updated their analysis of mtDNA (Currat and Excoffier,2004) to model what the implications of this amount of nuclear admixture were for actual interbreeding. Under a variety of demographic models, they found that a very low rate of interbreeding was likely and speculated that there may have been avoidance of interspecific matings or lower fitness among the hybrids. They estimated that over the entire period of overlap over the entire range of potential contact, as few as a few hundred mating events could lead to 1–4% introgression (Currat and Excoffier,2011). What is equally interesting is that despite this Eurasian Neanderthal introgression, depending on when and where these rare events occurred, different populations are likely to share different Neanderthal alleles (Wills,2011).
Only a month before Green et al. (2010) published the draft Neanderthal genome, a rather unexpected finding was published based on a 40,000-year-old fossil from Siberia. A juvenile distal phalanx (genetically demonstrated to be female) of an indeterminate hominin species from Denisova Cave in the Altai Mountains of southern Siberia yielded a complete mtDNA genome (Krause et al.,2010b). Despite the relatively recent age of the sample and its proximity within 100 km of a known Neanderthal site, the sequence was equally distantly related to both modern humans and Neanderthals. The sample appeared to diverge prior to the Neanderthal–modern human split, around one million years ago. This date is too late to belong to a lineage of H. erectus and too early to belong to the common ancestor of modern humans and Neanderthals, molecularly estimated from mtDNA at 465,000 years ago. Fortunately, additional information about this morphologically nondiagnosable individual was soon forthcoming.
Several months later, the same distal phalanx yielded a total of 5.2 billion bases of sequence providing 1.9X coverage of the nuclear genome of a group named Denisovans, analogously to Neanderthals who were named after the site of their discovery (Reich et al.,2010). A molar that is more primitive than those of Neanderthals or early moderns was also discovered at Denisova and has nearly identical mtDNA to the phalanx (Reich et al.,2010). Using two different chemical treatments and knowledge gained from the sequencing of the Neanderthal genome, the Denisovan nuclear genome is of higher quality and estimated to contain fewer errors than that of the Neanderthal. Unlike the mtDNA sequence, the Denisovan nuclear DNA clusters more closely with Neanderthals, with an average sequence divergence dated to around 640,000 years ago, compared with around 804,000 years ago for contemporary Africans. The comparison to modern Africans is important because Eurasians contain 1–4% Neanderthal ancestry. As Denisovans and Neanderthals share a period of common ancestry, a comparison to all modern humans, including Eurasians, would skew the estimates.
The seeming discordance between the mtDNA and nuclear genome estimates of how, and how long ago, Denisovans are related to modern humans and Neanderthals could be due to two possible explanations. One possibility is that the Denisovans hybridized with an as yet unknown hominin species that migrated out of Africa after H. erectus and before the Neanderthals and modern humans. The other possibility is that a variable population ancestral to modern humans and Neanderthals gave rise to this mtDNA type and it went extinct in both the modern human and Neanderthal lineages, a process known as incomplete lineage sorting. To date, neither of these hypotheses can be more strongly supported.
Unlike Neanderthals, who seem to have provided several percent of Eurasian ancestry, the first analysis found that Denisovan ancestry was found only in Melanesians (Reich et al.,2010). Melanesians were found to have up to 4.8% Denisovan ancestry with an additional component of Neanderthal ancestry, that is, up to 7.4% of their ancestry is from archaic hominins. Subsequent analyses analyzing a wide range of Asian populations found that Denisovan alleles are found in aboriginal Australians, near Oceanic, Polynesian, Fijian, and east Indonesian populations but not in South Asia or East Asia (Reich et al.,2011). The researchers interpret this to mean that the Denisovans ranged from Siberia all the way to Southeast Asia where the gene flow into these populations occurred.
Members of the same research group have more recently applied a new molecular technique to leftover portions of the original Denisovan DNA extraction to generate a high-quality (31X) genome that covers more than 99% of the “mappable” genome (Meyer et al., in press). With this basically complete genome, the original conclusions of Denisovan introgression into Southeast Asia and near Oceania stand, and additional insights into these people are possible. First, there is reduced X chromosome gene flow into modern humans, suggesting a greater amount of male-mediated gene flow. Second, the Denisovans likely had a diploid number of 46 chromosomes. As they are more closely related to Neanderthals than modern humans, my previous prediction that the reduction of the typical ape karyotype from 48 to 46 was a modern human autapomorphy is wrong (Disotell,2006). Finally, the Denisovans were not very genetically diverse, with only about 20% of the diversity found in modern Africans and around 26–33% of that found in Eurasians.
With a nearly complete genome, an estimate of the amount of “missing evolution” is feasible to date the specimen. By comparing the Denisovan genome to that of modern humans and a chimpanzee, one can estimate the number of substitutions that would have accumulated had the lineage continued to the present but that are absent because the individual died. Applying the molecular clock to these “missing” substitutions yields an estimate of 74,000–82,000 years for this specimen in broad agreement with its archeological context. Such a technique may be applicable to other fossils for which genomes can be generated even if they do not have good age estimates (Meyer et al., in press).
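The arithmetic behind such a “missing evolution” estimate can be sketched as follows. The example assumes a strict molecular clock, so that the fractional shortening of the ancient branch relative to a present-day human branch, both measured from the human–chimpanzee ancestor, equals the fraction of elapsed time that is missing. The branch counts and the 6.5-million-year divergence used here are hypothetical illustration values, not data from Meyer et al.

```python
def missing_evolution_age(modern_branch, ancient_branch, divergence_years):
    """Estimate the age of an ancient genome from its shortened branch.

    modern_branch: substitutions from the human-chimpanzee ancestor to a
        present-day human genome.
    ancient_branch: substitutions from the same ancestor to the ancient genome.
    divergence_years: assumed time since the human-chimpanzee ancestor.
    Assumes a constant substitution rate along both branches.
    """
    if modern_branch <= 0 or ancient_branch > modern_branch:
        raise ValueError("ancient branch must not exceed a positive modern branch")
    missing_fraction = (modern_branch - ancient_branch) / modern_branch
    return missing_fraction * divergence_years


# Hypothetical counts: the ancient branch is 1.2% shorter than the modern one.
age = missing_evolution_age(
    modern_branch=100_000,
    ancient_branch=98_800,
    divergence_years=6.5e6,  # assumed divergence time, not a measured value
)
print(round(age))
```

Because the result scales directly with the assumed divergence time and depends on small differences between large counts, sequencing error in the ancient genome and uncertainty in the calibration both translate directly into uncertainty in the inferred age.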
The discovery of an archaic hominin lineage of almost unknown morphology is extraordinarily intriguing. Do Chinese archaic or early modern human specimens or the Flores specimens belong to this lineage? Is there indeed a third archaic lineage that left Africa present in Siberia and Southeast Asia? Clearly, additional Denisovan fossils are needed as well as aDNA from other Asian specimens. Equally important is more thorough sampling of modern human populations. Far too many studies have suffered from inadequate sampling of living human diversity (Disotell,2000a). As Hammer et al. (2011) and Reich et al. (2011) demonstrated, if we are to understand archaic human diversity, we first have to understand modern human diversity.
IMPLICATIONS FOR ARCHAIC HOMININ AND HUMAN EVOLUTION FROM GENOMICS
Deciphering archaic hominin genomes has obvious and important implications for understanding the phylogenetic relationships amongst Homo. However, the genomes also provide important clues as to the biology of archaic and modern humans as well. These include windows into demography, soft tissue morphology and physiology, and uniquely human evolutionary changes.
Identifying uniquely human genetic changes
One of the benefits of having archaic genomes is the possibility of using these to identify uniquely human genetic changes. Genetic changes along the human lineage were identified relative to various primate genomes, including chimpanzee (Chimpanzee Sequencing and Analysis Consortium,2005), orangutan (Locke et al.,2011), gorilla (Scally et al.,2012), bonobo (Prüfer et al.,2012), and macaque (Rhesus Macaque Genome Sequencing and Analysis Consortium,2007). However, the most closely related of these species to humans, the chimpanzee, diverged over 6 million years ago from humans. Thus, an enormous amount of parallel or convergent evolution (homoplasy) as well as chimpanzee-specific changes could easily obscure the signal of human-specific adaptations. The more recent age (800 ka) of the common ancestor of human, Neanderthal, and Denisovan alleles reduces the chance of homoplasy (Crisci et al.,2011) (Fig. 2). As an example, Burbano et al. (2012) have used Neanderthal and Denisovan sequences to examine “human accelerated regions,” sequences that are very conserved in vertebrates but fast evolving in modern humans, to suggest which regions should receive priority for functional studies.
One of the first examples of the importance of having closely related hominin sequences to help infer human-specific changes was with the FOXP2 gene, a gene involved in language development. Despite the gene's extreme conservatism, Enard et al. (2002) found two amino acid changes in the human lineage in a comparison of human, chimp, gorilla, orang, rhesus, and mouse sequences. Thus, this gene would seem to be under selection in the human lineage, with amino acid changes occurring within the last 120,000 years. These changes may have something to do with the development of modern human language abilities (Enard et al.,2002). However, Krause et al. (2007) found these same changes in the Neanderthal genome and suggested that the mutations were much older and shared between humans and Neanderthals. Coop et al. (2008) countered that the selective signature in humans was much younger. With the discovery of admixture between humans and Neanderthals, these two views may be compatible if these changes arose in humans and introgressed into Neanderthals.
Evans et al. (2005) suggested that an allele of a gene involved in the regulation of brain size, microcephalin (MCPH1), appeared around 37,000 years ago in modern humans and was under strong positive selection, sweeping to 70% frequency worldwide. Typing of a much larger worldwide sample for the SNPs defining the specific allele led to the conclusion that this derived allele actually appeared around 1.1 million years ago and introgressed into modern humans from an archaic population (possibly Neanderthals) around 37,000 years ago (Evans et al.,2006). However, even this more thorough sequence data contained just nine Africans and therefore did not come close to sampling human diversity, especially given that so much of it exists in sub-Saharan Africa, leaving their conclusion premature. The subsequent sequencing of the Neanderthal and Denisovan genomes did not detect the so-called derived allele, which makes it an unlikely candidate to be both ancient and to have introgressed.
Another example in which the archaic genomes helped to determine when human-specific adaptations occurred is in the genes encoding the FADS1 and FADS2 enzymes that are related to fatty acid metabolism (Ameur et al.,2012). In this case, the alleles that allow for increases in efficiency synthesizing essential long-chain fatty acids clearly arose after humans split from Neanderthals. The authors speculate that this new allele would be advantageous in environments with limited access to these key fatty acids.
Several immune system genes show evidence of having introgressed from archaic hominins into the human genome. Alleles from OAS1 and several HLA haplotypes show evidence of Neanderthal and/or Denisovan origin (Abi-Rached et al.,2011; Mendez et al., 2012). Such introgression makes sense given that these archaic lineages had been living in Eurasia for hundreds of thousands of years prior to the appearance of anatomically modern human migrants from Africa. The Neanderthals and Denisovans would have presumably adapted to the various Eurasian pathogens over the course of time. When immunologically naive populations of modern humans hybridized with them, however rarely, individuals who gained local alleles would have been at an advantage. Coupled with the iterative founder effects of an expanding modern population, these alleles would be carried to even higher frequencies (Currat and Excoffier,2011).
A catalog of nearly all of the uniquely modern human substitutions since we split from our archaic cousins is possible given the high-quality Denisovan genome now available. More than 100,000 single-nucleotide changes and 10,000 insertions and deletions have occurred that remarkably yield only 260 amino acid changes that differ between modern humans and Denisovans (Meyer et al., in press). In proteins that are highly conserved amongst the primates, only 23 differ in modern humans of which eight are involved in neurodevelopment. A particularly interesting protein that varies only in humans is EVC2, which when mutated causes a human disease that includes taurodontism (Meyer et al., 2012).
Archaic genes and morphology
Of great interest is what specific genes can tell us about Neanderthal morphology and potential behavior. An interesting case is that of the melanocortin 1 receptor (MC1R) gene for which a partial loss-of-function allele in humans leads to red hair and pale skin (Rees,2000). Lalueza-Fox et al. (2007) sequenced two Neanderthals, one from Monte Lessini, Italy, and one from El Sidrón, Spain, which were demonstrated by Noonan et al. (2006) and Green et al. (2006) to have intact nuclear DNA. Amplifying very short regions and taking many precautions to avoid contamination, including replicating their sequences in two additional laboratories, Lalueza-Fox et al. (2007) were able to find the same variant in both Neanderthals that led to an amino acid replacement. Although they were not able to determine if the individuals were homozygous at that particular site, the fact that two different individuals had the variant suggested that at least some Neanderthals would have been homozygous. When the researchers expressed a copy of the protein in a cell-based assay, the protein demonstrated the same degree of loss of function as the human allele that leads to red hair and pale skin, suggesting that at least some Neanderthals would also have had that condition. Some researchers had previously speculated that Neanderthals may have had red hair and pale skin because such a condition would be a selective advantage in a population in less UV light intensive regions of the world such as Europe (Jolly,2001). Such genes would thus be ideal candidates for introgression if Neanderthals hybridized with modern humans. However, the underlying mutations responsible for the loss of function in Neanderthals and in Europeans with red hair and pale skin are completely independent of each other.
Other archaic hominins, including some Neanderthals, do not carry this variant. The only one of the three Vindija specimens for which the MC1R sequence was generated did not contain the loss-of-function allele that was found in the Italian and Spanish Neanderthals by Lalueza-Fox et al. (2007). Additionally, Cerqueira et al. (in press) examined 124 SNPs found in various whole genomes, including the Neanderthal and Denisovan drafts, to infer different aspects of human pigmentation. Their analysis suggests that these archaic hominins had darker skin, red or brown hair, and brown eyes. The complete Denisovan genome reveals that the female individual had dark skin and brown eyes and hair (Meyer et al., 2012).
Two male Neanderthals from Spain showed evidence for the deletion responsible for the O blood type from the ABO system (Lalueza-Fox et al.,2008). Another interesting allele found in Neanderthals is that of a micro-RNA (a type of posttranscriptional repressor), miR-1304, which regulates several genes involved in enamel formation. The ancestral version of miR-1304 is predicted to affect the expression of amelotin and enamelin. Lopez-Valenzuela et al. (2012) suggested that the difference in expression of these genes caused by miR-1304 may lead to the slower dentition timing and thicker enamel present in modern humans when compared with Neanderthals. As we understand more of the genome and its relationship to development, very exciting insights into both Neanderthals and ourselves will be forthcoming.
SUMMARY AND CONCLUSIONS
aDNA from archaic humans has contributed a great deal to our knowledge of modern humans and Neanderthals and has identified a previously unknown lineage of archaics, the Denisovans. New insights into the demography of our archaic relatives, as well as aspects of their soft tissue anatomy and physiology, have been gleaned from their genomes. A better understanding of uniquely human genes is also now possible with relatives much closer to us than chimpanzees and bonobos.
Both multiregional models (MRE) and the recent replacement models (RAO) of human origins fall short in light of new data from the genomes of archaic hominins.
Although the original version of the MRE has not been seriously considered for decades, more realistic versions of the model have proposed that modern African migrants hybridized with local archaics in different regions of the Old World. Prior to the sequencing of the Neanderthal and Denisovan genomes, modern patterns of variation and ancient mtDNA from archaics and early anatomically modern human samples provided no clear evidence of introgression. Genomic data have since demonstrated that the pattern of allele sharing between these different populations and the presence of extended shared haplotypes indicate that such introgression did occur. However, realistic models demonstrate that only a very minimal level of hybridization is likely to have occurred, with the introgressed alleles surfing the wave of expansion to higher frequencies in the expanding modern population.
At the moment, the Eurasian-wide presence of Neanderthal alleles in modern populations suggests that either such introgression happened almost immediately on exiting Africa or perhaps in North or East Africa prior to migration. The Denisovan genome is even more intriguing. Almost nothing is known about these people morphologically, except for the skin, hair, and eye color. Could they be related to other Asian archaics, Flores, or yet another unknown lineage? Furthermore, it is highly likely that other archaic populations in Africa also contributed to the modern genome.
As we better understand the human genome and its links to morphology and behavior, we will be able to gain deeper insights into our not quite extinct relatives.
The author thanks Bob Sussman for the opportunity to write this review and for his patience in allowing the very latest developments to be included. The author is particularly thankful for the numerous discussions with Shara Bailey, Susan Antón, Terry Harrison, Jason Hodgson, Luca Pozzi, Christina Bergey, and Andy Burrell that helped to tease apart many of the complex and sometimes confusing and contradictory interpretations and analyses that have been put forth over the years.