A Bidirectional Corridor in the Sahel-Sudan Belt and the Distinctive Features of the Chad Basin Populations: A History Revealed by the Mitochondrial DNA Genome

Authors

  • V. Černý,

    Corresponding author
    1. Department of Anthropology & Environment, Institute of Archaeology, Czech Academy of Sciences, 118 01 Prague 1, Czech Republic
  • A. Salas,

    1. Unidad de Genética, Instituto de Medicina Legal, Facultad de Medicina, Universidad de Santiago de Compostela, 15782, and Centro Nacional de Genotipado (CeGen), Hospital Clínico Universitario, 15706, Galicia, Spain
  • M. Hájek,

    1. Department of Anthropology & Environment, Institute of Archaeology, Czech Academy of Sciences, 118 01 Prague 1, Czech Republic
  • M. Žaloudková,

    1. Department of Anthropology & Environment, Institute of Archaeology, Czech Academy of Sciences, 118 01 Prague 1, Czech Republic
  • R. Brdička

    1. Accredited Laboratory for DNA Testing and Co-ordination Centre of Genetic Laboratories in the Czech Republic, Institute of Haematology and Blood Transfusion, 128 20 Prague 2, Czech Republic

Corresponding author: Černý Viktor, Department of Anthropology & Environment, Institute of Archaeology, Czech Academy of Sciences, 118 01 Prague 1, Czech Republic, Fax: +420.257532288, Tel: +420.257014304, E-mail: cerny@arup.cas.cz

Summary

The Chad Basin was sparsely inhabited during the Stone Age, and its continual settlement began with the Holocene. The role played by Lake Chad in the history and migration patterns of Africa is still unclear. We studied the mitochondrial DNA (mtDNA) variability in 448 individuals from 12 ethnically and/or economically (agricultural/pastoral) different populations from Cameroon, Chad, Niger and Nigeria. The data indicate the importance of this region as a corridor connecting East and West Africa; however, this bidirectional flow of people in the Sahel-Sudan Belt did not erase features peculiar to the original Chad Basin populations. A new sub-clade, L3f2, is described, which together with L3e5 is most probably autochthonous in the Chad Basin. The phylogeography of these two sub-haplogroups seems to indicate prehistoric expansion events in the Chad Basin around 28,950 and 11,400 Y.B.P., respectively. The distribution of L3f2 is virtually restricted to the Chad Basin alone, and in particular to Chadic speaking populations, while L3e5 shows evidence for diffusion into North Africa at about 7,100 Y.B.P. The absence of L3f2 and L3e5 in African-Americans, and the limited number of L-haplotypes shared between the Chad Basin populations and African-Americans, indicate the low contribution of the Chad region to the Atlantic slave trade.

Introduction

Since ancient times Lake Chad has been somewhat isolated geographically and, while some researchers have considered this region a crossroads for human migrations, others regard it as a final destination where population movements from Western and Eastern Africa terminated (Lange, 1992; Cyffer, 2002). From an ethno-linguistic point of view the Chad Basin is the homeland of highly diversified groups: three of the four African linguistic families (Afro-Asiatic, Niger-Congo and Nilo-Saharan) overlap here. The middle part of the Sahel-Sudan belt (sometimes referred to as Central Sudan) has its own history of great Islamic empires, including the Kanem, Bornu, Bagirmi, Waddai and others (Insoll, 2003), but its prehistory is still not well understood (Newman, 1995). The natural conditions around Lake Chad have at all times been dictated by oscillating wet and dry periods, which alternated not only at an annual level but also over much longer intervals (Maley, 1981). It seems that during the kanémien period (20,000–12,000 B.C.) only habitable desert existed around Lake Chad. Around 9,000 years ago, however, a large part of the Chad Basin was already underwater. Lake Megachad rose at this time to a height of 325 m above sea level (a.s.l.), and flowed through the Bahr el-Ghazal into the Bodélé plains of northern Chad. Its southern shore is still visible in the dune belt running along the Maiduguri-Bama-Limani-Borgor line, almost as far as latitude 10° North. It is estimated that Lake Megachad covered an expanse of some 330,000 km²; its current extent, at an altitude of 282 m a.s.l. and covering a mere 20,000 km², may thus be regarded as a relatively insignificant relict (Brunk & Gronenborn, 2004). With the gradual drying of the climate, however, vegetation patterns stabilised and the present ethnic composition formed.
The oscillating withdrawal of Lake Megachad around 5,000 Y.B.P., and with it the growth of the Sahara, led to a certain isolation of the Chad Basin populations – traces of ancient Egyptian campaigns end in the Gilf Kebir region in the south-western tip of what is now Egypt. The Sahel-Sudan belt, between the Sahara to the North and the tropical rain forests to the South, was for thousands of years a broad corridor along which cultural influences, as well as human migrations from East and West Africa, moved.

Beef cattle, prehistoric depictions of which can be found in the mid-Saharan rocky massifs neighbouring the northern parts of the Chad Basin, were of prime importance among domesticated animals. The skeletal remains of domesticated cattle from the 4th millennium B.C. have been found at several sites in northern Niger, of which the most important was probably Adrar Bous (Haour, 2003). In the southern part of the Chad Basin the first evidence, in the form of domesticated cattle bones, comes however from just 3,000 Y.B.P. Plant cultivation, too, began relatively late in the Chad Basin. Pearl millet (Pennisetum glaucum) was the first agricultural plant to appear in the archaeological profiles from north-eastern Nigeria; this was brought to Lake Chad at the close of the second millennium B.C., probably from northern Niger which may have been one of the two West African centres of domestication. Only in the second phase, and in conjunction with the broad spread of previously established iron metallurgy, did the cultivation of sorghum (Sorghum bicolor) begin, most likely having been domesticated in the Nile Valley; the earliest finds from the Chad Basin, however, date to the first half of the first millennium A.D. The marked delay in the advance of agricultural technologies in comparison with the outside world may be explained by the unusually favourable natural conditions of the Early and Middle Holocene, which did not compel the local inhabitants to adopt physically demanding and tedious crop cultivation (Neuman, 2003). The relatively late establishment of cattle may be associated with the threat of sleeping sickness, against which the pastoralists of the southern reaches of the Chad Basin must still protect their herds today (Gifford-González, 2000). 
Fishing and the diverse food sources linked to areas of water – across which the local population was able to move very effectively – were very probably of great importance; this is attested by what is thus far the earliest wooden boat to be found in Africa, from the site of Dufuna in north-eastern Nigeria, which radiocarbon dating shows to be 8,000 years old (Breunig et al. 1996).

In recent years genetics has made a substantial contribution to our understanding of human migration patterns. The mitochondrial genome (mtDNA) in particular has played a central role in unravelling the past and present history of African populations. Sampling in Africa is, however, still very insufficient, and many regions and ethnic groups remain uncharacterised; this is especially true for less accessible areas such as eastern Chad or the Congo Basin.

L–type haplogroups account for most of the sub–Saharan African mtDNA variability (Bandelt et al. 2001; Beleza et al. 2005; Brehm et al. 2002; Chen et al. 1995, 2000; Kivisild et al. 2004; Pereira et al. 2001; Plaza et al. 2004; Rando et al. 1998; Rosa et al. 2004; Salas et al. 2002, 2004a, 2004b, 2005b; Torroni et al. 2001a, 2001b, 2006; Watson et al. 1996, 1997). Many sub–Saharan L (sub)haplogroups (and their phylogeographic information content) and coalescence times have recently been summarised (Salas et al. 2002), and further updated in more recent studies (e.g. Richards et al. 2004; Kivisild et al. 2004; Salas et al. 2004b; Torroni et al. 2006). It seems that the major diversifications originated in East Africa but that the Bantu expansion, with its homeland in West Africa or the western Central region, made a major contribution to the present distribution of sub-Saharan African lineages (Salas et al. 2002, 2004b; Richards et al. 2004; Plaza et al. 2004; Beleza et al. 2005). The main aim of this article is to contribute to the knowledge of the population history of a relatively isolated part of Africa. To that end, we present a large set of mtDNA sequences of human populations living in the immediate vicinity of Lake Chad (in the Chad Basin), and analyse their genetic relationships with particular attention to the neighbouring populations of the Sahel/Sudan belt of Africa.

Material and Methods

Criteria for Population Sampling

We analysed 448 individuals from 12 different populations, sampled around Lake Chad in the southern part of the Chad Basin in northern Cameroon, western Chad, south-eastern Niger and north-eastern Nigeria (Table 1). This selection was made to embrace as broadly as possible the ethnic composition, economic orientations and geographic position of that area (Figure 1). Buccal swabs were collected from maternally unrelated volunteers, all of whom gave informed consent.

Table 1. Diversity indices of HVS-I mtDNA in the population samples from the Chad Basin

Population | Geographical region | Language branch/Language family | Economic/Cultural regime | N | k | S | h | π | M
Hide* | Northern Cameroon | Chadic/AA | Agriculturalist | 23 | 22 | 46 | 0.996 ± 0.014 | 0.0256 ± 0.0137 | 8.7 ± 4.2
Kotoko* | Northern Cameroon | Chadic/AA | Agriculturalist | 56 | 31 | 52 | 0.955 ± 0.017 | 0.0211 ± 0.0111 | 7.2 ± 3.4
Mafa* | Northern Cameroon | Chadic/AA | Agriculturalist | 32 | 23 | 44 | 0.980 ± 0.012 | 0.0222 ± 0.0118 | 7.6 ± 3.6
Masa* | Northern Cameroon | Chadic/AA | Agriculturalist | 32 | 27 | 38 | 0.988 ± 0.012 | 0.0216 ± 0.0115 | 7.4 ± 3.5
Buduma | South-eastern Niger | Chadic/AA | Agriculturalist | 30 | 22 | 42 | 0.968 ± 0.021 | 0.0219 ± 0.0117 | 7.5 ± 3.6
Chad Arabs | Western Chad | Semitic/AA | Nomadic | 27 | 20 | 35 | 0.963 ± 0.023 | 0.0197 ± 0.0107 | 6.7 ± 3.3
Shuwa Arabs | North-eastern Nigeria | Semitic/AA | Semi-nomadic | 38 | 27 | 43 | 0.977 ± 0.012 | 0.0177 ± 0.0096 | 6.0 ± 2.9
Fali | North Cameroon | Adamawa-Ubangui/NC | Agriculturalist | 40 | 23 | 43 | 0.947 ± 0.019 | 0.0216 ± 0.0115 | 7.4 ± 3.5
Borgor Fulani** | Western Chad | Atlantic/NC | Nomadic | 49 | 26 | 33 | 0.931 ± 0.024 | 0.0197 ± 0.0105 | 6.7 ± 3.3
Tcheboua Fulani** | Central Cameroon | Atlantic/NC | Nomadic | 40 | 21 | 40 | 0.953 ± 0.016 | 0.0207 ± 0.0110 | 7.1 ± 3.4
Kanembu | North-western Chad | Saharan/NS | Agriculturalist | 50 | 37 | 58 | 0.988 ± 0.006 | 0.0258 ± 0.0134 | 8.8 ± 4.1
Kanuri | North-eastern Nigeria | Saharan/NS | Agriculturalist | 31 | 27 | 50 | 0.989 ± 0.012 | 0.0224 ± 0.0120 | 7.6 ± 3.7

  1. NOTE: *The Hide and the Mafa correspond to those individuals analysed in Černý et al. (2004); the Masa sample includes all the individuals from Černý et al. (2004; N = 31) plus 1, while the Kotoko also incorporates 38 additional DNAs with respect to the dataset reported by Černý et al. (2004; N = 18). **The Borgor Fulani and Tcheboua Fulani were previously reported in Černý et al. (2006).

  2. N = sample size; k = number of different sequences; S = number of segregating sites; h = haplotype diversity; π = nucleotide diversity; M = observed average number of pairwise differences. AA = Afro-Asiatic; NC = Niger-Congo; NS = Nilo-Saharan.

Figure 1. Map showing the location of the samples analysed in the present work.

One group of five Chadic-speaking populations and one group of two Semitic-speaking populations were selected from the Afro-Asiatic language family. The first group of Chadic-speaking populations comprises peasant populations from northern Cameroon and south-eastern Niger – the Hide and Mafa of the Mandara Mountains (the same individuals analysed in Černý et al. 2004; N = 23 and N = 32, respectively), the Kotoko of the Shari basin (those individuals analysed in Černý et al. 2004 [N = 18] plus 38 new subjects), the Masa of the Logon basin (the individuals analysed in Černý et al. 2004 [N = 31] plus one additional subject) and the Buduma of the north-western shore and islands of Lake Chad in Niger (N = 30). The Semitic group comprised two Arabic-speaking populations – the first made up of nomadic tribes migrating in Kanem and Bagirmi in Chad (N = 27), and the second composed of semi-nomadic Shuwa Arabs from the Borno state in Nigeria (N = 38).

From the Niger-Congo phylum one peasant Fali population from the Tinguelin rocky massif approximately 30 km North of Garoua in Cameroon (N= 40), and two Fulani populations were selected – the first nomadic Fulani sample was taken from the middle Logon South of Borgor in Chad (N= 49), and the second from the Tcheboua region, around 30 km South of the Benue River in Cameroon (N= 40). The latter sample is made up of nomads that have settled recently (approximately one or two generations ago, as reported by their leaders), but whose dependence on cattle rearing is still high.

The Kanembu from Kanem, northeast of Lake Chad in Chad (N= 50), and the Kanuri from the Borno state in Nigeria, southwest of the lake, (N= 31) were sampled from the Nilo-Saharan phylum.

Laboratory Methods

DNA extraction was performed using the method presented in Černý et al. (2004). HVS-I was amplified by means of primers F–15971 (5′–TTA ACT CCA CCA TTA GCA CC–3′) and R–16410 (5′–GAG GAT GGT GGT CAA GGG AC–3′), with an annealing temperature of 51°C. Purification was undertaken using the QIAquick PCR purification kit (QIAGEN). Sequencing reactions were carried out using the BigDye Terminator v.3.1 Cycle sequencing kit (Applied Biosystems). The sequence range of 16030–16370 bases was considered.

MtDNA variability in sub-Saharan Africa is characterised primarily by L-type haplogroups. Because of their high internal diversities, most of them, and even their sub-clades, can be recognised from the first hypervariable segment (HVS-I); some, however, require more information concerning coding region variants. RFLP analyses were thus undertaken for all the samples for HpaI 3592 (targeting position C3594T that determines L0′1′2′5′6 sequences, sensu Torroni et al. 2006) and MboI 2349 (transition T2352C determines the L3e branch). Variation at position 16390 (diagnostic of haplogroup L2) was recorded from chromatograms in most of the samples; in some cases, however, it was necessary to carry out RFLP detection using AvaII. The primer sequences and temperature profiles of the RFLP analyses are available on request from the corresponding author. Finally, some non-L-type mtDNAs were further RFLP genotyped using AluI 7025 (targeting site 7028) and MseI 14766 (for site 14766).
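The marker logic in the paragraph above can be written as a small decision rule. The following is only a minimal sketch of the three statements made in the text (+3592 HpaI marks L0′1′2′5′6, the 16390 variant is diagnostic of L2, +2349 MboI marks L3e); the function name, its boolean inputs and the output labels are illustrative assumptions, and it is not the authors' classification procedure, which also relied on HVS-I motifs.

```python
def coarse_clade(hpai_3592: bool, mboi_2349: bool, has_16390: bool) -> str:
    """Coarse clade from the three markers named in the text (hypothetical helper)."""
    if hpai_3592:
        # +3592 HpaI marks the L0'1'2'5'6 sequences; 16390 singles out L2.
        return "L2" if has_16390 else "L0'1'2'5'6, not L2"
    if mboi_2349:
        # Outside L0'1'2'5'6, +2349 MboI is taken here as marking L3e.
        return "L3e"
    return "outside L0'1'2'5'6, not L3e"


print(coarse_clade(True, False, True))
print(coarse_clade(False, True, False))
```

Finer assignments (for example L0a1 versus L1c3) still depend on the HVS-I motif, so a rule of this kind only narrows the search.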

Nomenclature

The 16090 to 16365 sequence range was used for cross-comparisons of haplotypes in populations. It should be noted that the African nomenclature is in need of revision due to the new data available (especially concerning complete genomes). We here show the updated criteria for nomenclature that concerns the lineages observed and discussed in the present article and is based on complete genome data (available in the public literature and/or GenBank [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide]).
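Restricting haplotypes to the shared 16090–16365 range can be illustrated as follows. This is a minimal sketch that assumes motifs are written in the usual shorthand of position minus 16000 (so `093` means 16093 and `188G` a transversion at 16188); the function name `trim_motif` is hypothetical.

```python
import re


def trim_motif(motif, lo=16090, hi=16365):
    """Keep only the variants of an HVS-I motif that fall within lo..hi."""
    kept = []
    for token in motif.split():
        match = re.match(r"\d+", token)
        if match is None:  # e.g. "CRS": no variants to trim
            continue
        if lo <= 16000 + int(match.group()) <= hi:
            kept.append(token)
    return kept


a = trim_motif("069 093 126 187 189 223 264 270 278 293 311")
b = trim_motif("093 126 187 189 223 264 270 278 293 311")
print(a == b)
```

Two sequences that differ only outside the range, such as the pair above, collapse into one haplotype for cross-population comparison.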

L1b1 and L1c (and sub-clades) follow Salas et al. (2002). It is now known that L1e (Salas et al. 2002) occupies an intermediate branch between L1 and L2, so that it is now referred to as L5 (Kivisild et al. 2004; Torroni et al. 2006). L0a1 and L0a1a refer to the old L1a1 and L1a1a in Salas et al. (2002) (see the complete genome data from Mishmar et al. 2003).

There are only four complete genomes available (Ingman et al. 2000; Mishmar et al. 2003) as representatives of haplogroup L2a (although there are a lot of coding region segments); we again follow the nomenclature used as in Salas et al. (2002) since it is still consistent with this data. L2b, L2b1, L2c, L2c1, and L2d2 follow Salas et al. (2002).

L3h was first identified by Rosa et al. (2004); according to these authors L3h is characterized by a −1715 DdeI site (which identifies the transition G1719A) and by the HVS-I variants G16129A C16256A T16362C. We now know that this variation probably identifies a sub-lineage of L3h characterized by coding region sites G1719A A4388G C5300T T9509C A11590G and by (at least) the control region mutations C16256A A16284G. We here rename this sub-cluster as L3h1 based on the data provided in Kivisild et al. (2004) and Torroni et al. (2006). Further, we observe that the basal motif of L3h should also include variant T195C. L3f and L3f1 are named as in Salas et al. (2002); there is however some evidence that transition C16292T that identifies L3f1 is just a sub-branch of a L3f sub-lineage defined by coding region sites C5601T T9950C (further subdivided by A14769G; Kivisild et al. 2004, and other coding region variants, see also Torroni et al. 2006). L3f2 is defined here as the branch carrying C16176T and C16234T on top of the L3f motif; there is one complete genome in Torroni et al. (2006) representative of this sub-cluster (see below). There is no control region variant that identifies L3e (Bandelt et al. 2001; Salas et al. 2002); nevertheless L3e1, L3e1b, L3e2 and L3e2b are defined by their HVS-I diagnostic sites as in Salas et al. (2002); at least for L3e2 and L3e2b the nomenclature is fully consistent with the complete genome data (e.g. Torroni et al. 2006). There is only one complete genome representative of L3e3; therefore, the nomenclature remains as in Salas et al. (2002). L3e5 has recently been defined by Torroni et al. (2006), based on only one complete genome (their #13 sample); here we observe that the HVS-I motif of L3e5 is constituted by transitions A16041G C16223T. Also according to our data, the extra HVS-I variant A16037G seems to define a small sub-branch, named here as L3e5a (also characterizing sample #13 of Torroni et al. 2006).
L3d and its main sub-branches (L3d1 and L3d2), as well as L3b and L3b1, are named as in Salas et al. (2002); the nomenclature is still consistent with the scarce complete genome data available. L3g is the branch defined by T16086C C16223T A16293T T16311C C16355T T16362C in HVS-I (Salas et al. 2002, 2004a,b); here we rename it L4, since it constitutes a branch between L3 and L7 (Torroni et al. 2006; note that this branch was also renamed by Kivisild et al. (2004) as L4g); L4 sub-branches follow the Salas et al. (2004b) phylogenetic scheme. Finally, nomenclature for U6 and M1 follows that of the recent article by Olivieri et al. (2006; see also references therein), while U5 follows Achilli et al. (2005). For the rest of the European lineages we use the most recently updated nomenclature from Palanichamy et al. (2004).

Statistical and Phylogenetic Analyses

Median networks of HVS-I sequences were drawn by hand using the principles of the median-joining algorithm (Bandelt et al. 1999). Subsequently, the most parsimonious tree of haplogroups was inferred. Coalescence times were calculated using the ρ (rho) statistic, and an HVS-I mutation rate of one transition per 20,180 years was applied for the sequence range 16090–16365 using Network 4.1.1.2 software (Bandelt et al. 1995; Forster et al. 1996; Saillard et al. 2000). The diversity indices of the HVS-I sequences (haplotype diversity, nucleotide diversity, and average number of pairwise differences) were calculated using Arlequin 3.0 software (Excoffier et al. 2005). AMOVA (Excoffier et al. 1992) was performed using Arlequin 3.0, and the significance of the covariance components associated with the different levels of genetic structure was tested using a non-parametric permutation procedure (Excoffier et al. 1992). Principal Component Analysis was performed based on haplogroup frequencies as in Salas et al. (2005b). Comparisons between populations were assessed by FST distances, which were subsequently plotted by multidimensional scaling analysis (MDS) using the PROXSCAL technique, implemented in the SPSS 10.0 statistical package. An HVS-I mtDNA database of African populations and African-Americans (>6,600 mtDNAs) was employed for population comparisons; more details concerning these data can be found in Salas et al. (2005b). Note that the Bamileke and Ewondo in Destro-Bisol et al. (2004a), as well as the Bakara, Basa and Fulbe in Destro-Bisol et al. (2004b), are included in the dataset of Coia et al. (2005). The classification of samples in the main African regions is taken from previous works (Salas et al. 2002, 2004a, 2005b); the allocation of some population samples to, for example, the western Central or western African pool is based on pragmatic reasons; the analysis carried out in the present project and the conclusions drawn are not substantially dependent upon this classification. Variation at positions 16182–16185 and length polymorphism at the poly-C tract were not considered. A posteriori (post-sequencing) phylogeographic checking of the mtDNA sequence data was carried out, in order to avoid data errors as far as possible (e.g. Bandelt et al. 2004a,b, 2005a,b; Salas et al. 2005a,e, 2006; Yao et al. 2006).
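The ρ-based dating mentioned above reduces to simple arithmetic: ρ is the average number of transitions separating the sampled sequences from the root haplotype of a clade, and the age estimate is ρ multiplied by the calibration of 20,180 years per transition. The sketch below assumes that the distance of each sampled sequence from the root has already been read off the network; the input list is invented for illustration, and the code does not reproduce the Network software or its standard-error calculation.

```python
YEARS_PER_TRANSITION = 20_180  # calibration quoted in the text for 16090-16365


def rho_age(steps_from_root):
    """Return (rho, age in years) from per-individual distances to the root."""
    rho = sum(steps_from_root) / len(steps_from_root)
    return rho, rho * YEARS_PER_TRANSITION


# Hypothetical clade: two root types, two one-step and one two-step derivative.
rho, age = rho_age([0, 0, 1, 1, 2])
print(rho, round(age))

# Working backwards, a published age implies the underlying rho.
implied_rho = 28_950 / YEARS_PER_TRANSITION
print(round(implied_rho, 2))
```

Read in reverse, the ages of about 28,950 and 11,400 years given in the Summary correspond to ρ values of roughly 1.4 and 0.6 transitions, respectively.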

Results

Descriptive Parameters of the HVS-I Sequences

Table 1 shows the descriptive parameters of the Chad Basin populations analysed. It is interesting to note that, in a broad sense, the values for the different diversity parameters (haplotype and nucleotide diversities, average number of pairwise differences) were lower for the nomadic groups than for their agricultural counterparts (independently of geographic location); these differences are not, however, statistically significant.
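For readers who want to see how the three indices relate, the following is a minimal sketch using textbook definitions: haplotype diversity h = n/(n − 1) · (1 − Σ pᵢ²), M as the mean number of differences over all pairs of individuals, and π = M divided by the number of sites. The site count of 341 (positions 16030–16370 inclusive) and the motif shorthand (position minus 16000) are assumptions, and the code is not a re-implementation of Arlequin, which also reports the sampling errors shown in Table 1.

```python
import re
from collections import Counter
from itertools import combinations


def parse(motif):
    """Map each variant position to its token, e.g. '188G' -> {188: '188G'}; 'CRS' -> {}."""
    return {int(re.match(r"\d+", t).group()): t
            for t in motif.split() if t[0].isdigit()}


def n_differences(a, b):
    """Number of positions at which two parsed haplotypes differ."""
    return sum(1 for pos in a.keys() | b.keys() if a.get(pos) != b.get(pos))


def diversity(motifs, n_sites=341):
    """Return (h, M, pi) for a list of HVS-I motifs, one entry per individual."""
    n = len(motifs)
    parsed = [parse(m) for m in motifs]
    counts = Counter(tuple(sorted(p.items())) for p in parsed)
    h = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    pairs = list(combinations(parsed, 2))
    m = sum(n_differences(a, b) for a, b in pairs) / len(pairs)
    return h, m, m / n_sites


# Toy sample: four identical sequences plus one carrying an extra variant.
sample = ["126 189 362"] * 4 + ["126 189 362 364"]
h, m, pi = diversity(sample)
print(h, m, pi)
```

The same relation can be checked against Table 1: for the Hide, 8.7/341 ≈ 0.0255, close to the reported π of 0.0256 once rounding of M is allowed for.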

Patterns of Matching Sequences Between Chad Basin Populations and the Main African Regions

A large database of African types was used for cross-comparison with the Chad Basin mtDNAs. As shown in Table 2, the number of shared individual mtDNAs and haplotypes is higher with western Central Africa than with any other African region, followed by West Africa. The difference between western Central and West Africa is more evident when looking at the matched haplotypes: Chad populations share 59 haplotypes (∼29%) with West African populations, accounting for ∼13% of the total haplotypes in West Africa. The percentage of shared haplotypes between Chad and western Central Africa is significantly higher (N= 101; ∼50%) accounting for ∼26% of the total haplotypes in western Central Africa.
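The haplotype percentages quoted above follow directly from the counts in Table 2. As a minimal sketch (the function names are hypothetical, and the raw haplotype lists of the comparison database are not reproduced here), the shared set is an intersection and each percentage is the shared count divided by one side's haplotype total:

```python
def shared_haplotypes(region, chad):
    """Haplotypes present in both collections (iterables of motif strings)."""
    return set(region) & set(chad)


def shared_percentages(n_shared, nh_region, nh_chad=203):
    """Shared haplotypes as a percentage of each side's haplotype total."""
    return 100 * n_shared / nh_region, 100 * n_shared / nh_chad


# Counts taken from Table 2: shared haplotypes and NH for each region.
west = shared_percentages(59, 452)
western_central = shared_percentages(101, 379)
print(west, western_central)
```

With the Table 2 counts this gives about 13% and 29% for West Africa and about 27% and 50% for western Central Africa, in line with the figures in the text.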

Table 2. Shared mtDNAs and haplotypes between Chad populations and different regions in Africa.

African region (a) | Chad Basin populations: individuals (b) (N = 448) (a) | Chad Basin populations: haplotypes (c) (NH = 203) (a)
EAST (N = 717; NH = 401) | 141–158 (19.7–35.3) | 32 (8.0–15.8)
NORTH (N = 1341; NH = 516) | 161–171 (12.0–38.2) | 40 (7.8–19.7)
SOUTH (N = 266; NH = 138) | 17–43 (6.4–9.6) | 9 (3.5–4.4)
SOUTH-EAST (N = 416; NH = 143) | 142–131 (34.1–29.2) | 28 (19.6–13.8)
SOUTH-WEST (N = 200; NH = 111) | 54–76 (27.0–17.0) | 21 (19.0–10.3)
WEST (N = 1228; NH = 452) | 529–249 (43.1–55.6) | 59 (13.0–29.1)
WESTERN CENTRAL (N = 999; NH = 379) | 444–319 (44.4–71.2) | 101 (26.6–49.8)

  1. Notes: (a) N = sample size; NH = number of different haplotypes.

  2. (b) The first number before the hyphen is the number of individuals in the given African region sharing mtDNA with Chad Basin individuals; after the hyphen is the opposite, i.e. the number of individuals in the Chad Basin that share mtDNA with individuals from the given African region. Numbers in parentheses are the corresponding percentages with respect to the totals (in Africa before the hyphen and in the Chad Basin after the hyphen). All these numbers are computed for the shared mtDNA sequence segment between all the samples, i.e. from position 16090 to 16365.

  3. (c) For the number of haplotypes shared between the Chad Basin and different African regions, the numbers in parentheses are firstly the percentage of these shared haplotypes from the total number of haplotypes (NH) in the given African region, and then the percentage from the total number of haplotypes in the Chad Basin.

The percentages of shared mtDNA and haplotypes with North, East, and especially South Africa, are low. As inferred from phylogeographic information (see next section) the Chad Basin haplotypes matching those from South and south-eastern Africa coincide mainly with those probably ‘moved’ from West Africa during the Bantu expansion. The presence of East African mtDNAs in the Chad Basin seems to mirror the existence of a historical, bi-directional flow between East and West Africa. As a whole, the Chad Basin manifests a clear predominance of a western Central African component.

The Phylogeography of the Chad Basin

Analysis of the HVS-I region, as well as of three additional coding region mutations (HpaI 3592, MboI 2349, AvaII 16390), in all the samples made possible a reasonable phylogenetic classification of the L–type sub–Saharan haplotypes into the already defined haplogroups (Table S1). Figure 2 shows the patterns of haplogroup frequencies in the main African regions. As expected, most of the Chad Basin mtDNAs could be attributed to L-haplogroups; a non-negligible 5–6%, however, are of West Eurasian origin. This West Eurasian component (e.g. pre-HV, members of haplogroup U, etc.) is more prevalent in the Semitic nomadic group of the Afro-Asiatic phylum, represented mainly by the Arabic tribes from Chad that account for the five (pre-HV) sequences detected, and to a lesser extent by the semi-nomadic Shuwa Arabs of Nigeria as well. Some portion of these sub-Saharan African “intrusive” haplogroups was also detected within the nomadic Niger-Congo Fulani; all the U5 sequences were found in the Borgor Fulani. Two representatives of the North African autochthonous haplogroup U6 were detected in the Kanuri and in the Mafa.

Table S1. HVS-I sequences and coding region RFLPs in 448 individuals from 12 different populations of the Chad Basin

No. | Pub | Population | Sample ID | HVS-I motif | 3592 HpaI | 2349 MboI | 7025 AluI | 14766 MseI | Haplogroup

  1. *An asterisk in the Pub column indicates the individuals analyzed in: Černý V, Hájek M, Čmejla R, Brůžek J, Brdička R (2004) MtDNA sequences of Chadic-speaking populations from northern Cameroon suggest their affinities with eastern Africa. Ann Hum Biol 31:554–569

  2. **Two asterisks in the Pub column indicate the individuals analyzed in: Černý V, Hájek M, Bromová M, Čmejla R, Diallo I, Brdička R (2006). The mtDNA of Fulani nomads and their genetic relationships to neighbouring sedentary populations. Hum Biol 78: 9–27.

  3. Note the unexpected status of 2349 MboI in sample #83 (haplogroup M1) and sample #445 (haplogroup H1). We have, however, found some back mutations in our database of complete or semi-complete (coding region) genomes, namely, Herrnstadt et al. (2002) within haplogroup H1 (sample #85), Palanichamy et al. (2004) within haplogroup K1a1b (sample #C40), and Achilli et al. (2005) within haplogroup U6b1 (sample #39).

1 Arabs Chad80126 189 362 (pre-HV)1
2 Arabs Chad81126 189 362 (pre-HV)1
3 Arabs Chad82126 189 362 (pre-HV)1
4 Arabs Chad91126 189 362 (pre-HV)1
5 Arabs Chad83126 189 362 364 (pre-HV)1
6 Buduma82CRS+H
7**Fulani Tcheboua122CRSH
8*Hide45CRSH
9 Kanuri19CRSH
10*Mafa21093 129 148 168 172 187 188G 189 223 230 311 320+ L0a1
11 Arabs Chad129129 148 168 172 187 188G 189 223 230 256 311 320+ L0a1
12 Kanembu97129 148 168 172 187 188G 189 223 230 256 311 320+ L0a1
13 Kanembu103129 148 168 172 187 188G 189 223 230 256 311 320+ L0a1
14*Masa75129 148 168 172 187 188G 189 223 230 256 311 320+ L0a1
15 Arabs Chad78129 148 168 172 187 188G 189 223 230 311 320+ L0a1
16 Arabs Chad93129 148 168 172 187 188G 189 223 230 311 320+ L0a1
17 Fali120129 148 168 172 187 188G 189 223 230 311 320+ L0a1
18 Kanembu72129 148 168 172 187 188G 189 223 230 311 320+ L0a1
19 Kotoko19129 148 168 172 187 188G 189 223 230 311 320+ L0a1
20*Kotoko66129 148 168 172 187 188G 189 223 230 311 320+ L0a1
21*Mafa49129 148 168 172 187 188G 189 223 230 311 320+ L0a1
22*Mafa7129 148 168 172 187 188G 189 223 230 311 320+ L0a1
23*Hide19093 129 147A 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
24*Masa90093 129 148 168 172 187 188G 189 193 223 230 278 293 311 320+ L0a1a
25 Buduma70093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
26*Hide42093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
27 Kanuri14093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
28 Kotoko52093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
29 Kotoko59093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
30*Masa86093 129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
31 Kanuri52129 148 168 172 184 187 188G 223 230 278 293 311 320+ L0a1a
32 Fali95129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
33 Kotoko27129 148 168 172 187 188G 189 209 223 230 278 293 311 320+ L0a1a
34 Fali89129 148 168 172 187 188G 189 223 230 278 293 311 320+ L0a1a
35*Kotoko55129 148 168 172 187 188G 189 223 230 278 293 311 320+ L0a1a
36*Mafa18129 148 168 172 187 188G 189 223 230 278 293 311 320+ L0a1a
37 Fali64129 148 168 172 187 188G 189 223 230 278 311 320+ L0a1a
38**Fulani Borgor6069 093 126 187 189 223 264 270 278 293 311++ L1b1
39**Fulani Borgor16093 126 187 189 223 264 270 278 293 311++ L1b1
40**Fulani Borgor29093 126 187 189 223 264 270 278 293 311++ L1b1
41**Fulani Borgor48093 126 187 189 223 264 270 278 293 311++ L1b1
42**Fulani Tcheboua118093 126 187 189 223 264 270 278 293 311++ L1b1
43**Fulani Tcheboua136093 126 187 189 223 264 270 278 293 311++ L1b1
44 Fali113126 172 187 189 223 264 270 278 293 311++ L1b1
45 Kanuri56126 172 187 189 223 264 270 278 293 311++ L1b1
46**Fulani Borgor44126 184 187 189 223 260 264 270 278 293 311++ L1b1
47 Kanembu98126 184 187 189 223 260 264 270 278 293 311++ L1b1
48 Kanembu109126 184 187 189 223 260 264 270 278 293 311++ L1b1
49 Kanembu120126 184 187 189 223 260 264 270 278 293 311++ L1b1
50**Fulani Borgor36 126 187 189 213 223 260 264 270 278 293 311++ L1b1
51**Fulani Borgor46 126 187 189 213 223 260 264 270 278 293 311++ L1b1
52**Fulani Tcheboua115 126 187 189 213 223 260 264 270 278 293 311++ L1b1
53**Fulani Tcheboua119 126 187 189 213 223 260 264 270 278 293 311++ L1b1
54*Mafa28 126 187 189 223 245 264 270 278 293 301 311++ L1b1
55 Kanembu99126 187 189 223 264 270 278 289 293 311++ L1b1
56 Arabs Shuwa184126 187 189 223 264 270 278 293 311++ L1b1
57 Fali6126 187 189 223 264 270 278 293 311++ L1b1
58 Fali40126 187 189 223 264 270 278 293 311++ L1b1
59 Fali75126 187 189 223 264 270 278 293 311++ L1b1
60**Fulani Borgor1 126 187 189 223 264 270 278 293 311++ L1b1
61**Fulani Borgor13 126 187 189 223 264 270 278 293 311++ L1b1
62**Fulani Borgor15 126 187 189 223 264 270 278 293 311++ L1b1
63**Fulani Borgor39 126 187 189 223 264 270 278 293 311++ L1b1
64**Fulani Borgor49 126 187 189 223 264 270 278 293 311++ L1b1
65**Fulani Tcheboua113 126 187 189 223 264 270 278 293 311++ L1b1
66**Fulani Tcheboua131 126 187 189 223 264 270 278 293 311++ L1b1
67**Fulani Tcheboua138 126 187 189 223 264 270 278 293 311++ L1b1
68**Fulani Tcheboua141 126 187 189 223 264 270 278 293 311++ L1b1
69 Kanembu55126 187 189 223 264 270 278 293 311++ L1b1
70 Kanembu114126 187 189 223 264 270 278 293 311++ L1b1
71*Kotoko48 126 187 189 223 264 270 278 293 311++ L1b1
72**Fulani Borgor10 126 187 189 223 270 278 293 311++ L1b1
73**Fulani Tcheboua125 038 129 187 189 223 256 278 284 293 294 311 360+ L1c1
74 Kanembu107129 148 187 189 192 223 255 278 293 294 311 360+ L1c1
75 Kanembu110129 148 187 189 192 223 255 278 293 294 311 360+ L1c1
76*Hide36 129 163 187 189 209 223 278 293 294 311 360+ L1c1
77*Masa99 129 163 187 189 209 223 278 293 294 311 360+ L1c1
78*Hide33 129 163 187 189 278 293 294 311 360+ L1c1
79*Hide18 093 129 172 189 215 223 278 294 311 360+ L1c3
80**Fulani Tcheboua146 129 180 182C 183C 189 215 223 278 294 311 355 360+ L1c3
81 Fali84129 189 213 215 223 278 291 294 311 355del 360 362 364+ L1c3
82*Masa100 129 189 215 223 278 294 311 360+ L1c3
83 Kotoko20067 189 223 229 278 291 294 311 390+ L2a
84 Kotoko46067 189 223 229 278 291 294 311 390+ L2a
85*Masa77 067 189 223 229 278 291 294 311 390+ L2a
86*Masa98 067 189 223 229 278 291 294 311 390+ L2a
87 Kanuri28093 209 223 278 294 301 354 390+ L2a
88 Kanembu56093 209 223 278 294 390+ L2a
89 Kanembu57093 209 223 278 294 390+ L2a
90 Arabs Chad90131 223 278 294 355 390+ L2a
91**Fulani Borgor31 136 189 223 229 270 278 291 294 311 390+ L2a
92 Arabs Shuwa88145 169 213 223 278 294 390+ L2a
93 Arabs Chad79145 189 213 223 278 294 390+ L2a
94 Arabs Shuwa90145 213 223 274 278 294 390+ L2a
95*Masa76 145 213 223 278 294 390+ L2a
96*Masa103 150 189 223 229 278 291 311 390+ L2a
97*Mafa50 169 223 278 294 311 355 390+ L2a
98 Kanembu116175 209 223 278 294 301 354 390+ L2a
99**Fulani Borgor14 183C 189 223 278 294 390+ L2a
100 Kanembu100188 189 223 229 278 291 294 311 390+ L2a
101 Arabs Chad131189 192 223 278 294 311 365 390+ L2a
102 Buduma67189 192 223 278 294 360 390+ L2a
103*Hide28 189 192 223 278 294 390+ L2a
104 Kanembu59189 209 223 271 278 294 301 354 390+ L2a
105 Kanembu63189 209 223 271 278 294 301 354 390+ L2a
106 Fali83189 223 229 278 291 294 311 320 364 390+ L2a
107 Fali81189 223 229 278 291 294 311 365 390+ L2a
108 Arabs Shuwa163189 223 229 278 291 294 311 390+ L2a
109*Masa88 189 223 229 278 291 294 311 390+ L2a
110 Kanembu106189 223 266 278 294 355 358 390+ L2a
111 Arabs Chad133189 223 266 278 294 355 390+ L2a
112 Kanembu102189 223 266 278 294 355 390+ L2a
113 Kanembu119189 223 266 278 294 355 390+ L2a
114 Arabs Chad134189 223 278 293 294 390+ L2a
115**Fulani Tcheboua143 192 223 278 311 390+ L2a
116**Fulani Borgor7 192 223 278 390+ L2a
117 Fali91209 223 278 294 301 311 354 390+ L2a
118 Arabs Chad149209 223 278 294 301 354 390+ L2a
119 Arabs Shuwa83209 223 278 294 301 354 390+ L2a
120 Arabs Shuwa85209 223 278 294 301 354 390+ L2a
121 Arabs Shuwa171209 223 278 294 301 354 390+ L2a
122 Kanembu65209 223 278 294 301 354 390+ L2a
123 Kotoko18209 223 278 294 301 354 390+ L2a
124*Kotoko43 209 223 278 294 301 354 390+ L2a
125 Arabs Shuwa175223 278 294 390+ L2a
126*Kotoko21 223 278 294 390+ L2a
127 Kotoko26223 278 294 390+ L2a
128*Masa105 223 278 294 390+ L2a
129 Kanembu105113C 223 278 294 309 390+ L2a1
130*Hide43 126 189 192 223 278 294 309 390+ L2a1
131 Kanembu68126 189 223 278 292 294 309 390+ L2a1
132*Mafa12 129 223 278 294 309 390+ L2a1
133*Mafa56 129 223 278 294 309 390+ L2a1
134 Fali37174 189 223 278 294 309 365 390+ L2a1
135 Arabs Chad85176 189 223 278 294 309 390+ L2a1
136 Kotoko5181 223 278 294 309 390+ L2a1
137 Kotoko13181 223 278 294 309 390+ L2a1
138 Kotoko53183C 189 223 245 278 294 309 390+ L2a1
139 Kanembu61187 189 192 223 278 294 309 390+ L2a1
140 Kanembu51189 192 221 223 278 294 309 390+ L2a1
141 Buduma77189 192 223 278 294 309 390+ L2a1
142 Buduma84189 192 223 278 294 309 390+ L2a1
143**Fulani Borgor30 189 192 223 278 294 309 390+ L2a1
144 Kanuri20189 192 223 278 294 309 390+ L2a1
145 Kanuri25189 192 223 278 294 309 390+ L2a1
146 Kanuri32189 192 223 278 294 309 390+ L2a1
147 Kotoko16189 192 223 278 294 309 390+ L2a1
148 Buduma88189 193 223 278 294 309 390+ L2a1
149 Kanembu111189 223 278 291 294 309 390+ L2a1
150 Kanembu112189 223 278 291 294 309 390+ L2a1
151 Kanembu113189 223 278 291 294 309 390+ L2a1
152 Arabs Chad87189 223 278 294 309 390+ L2a1
153 Arabs Shuwa172189 223 278 294 309 390+ L2a1
154 Fali105189 223 278 294 309 390+ L2a1
155 Buduma75213 223 278 291 294 309 390+ L2a1
156 Kotoko57223 234 278 294 309 390+ L2a1
157 Buduma74223 278 294 309+ L2a1
158 Arabs Chad135223 278 294 309 390+ L2a1
159 Buduma68223 278 294 309 390+ L2a1
160*Hide9 223 278 294 309 390+ L2a1
161 Kanembu70223 278 294 309 390+ L2a1
162 Kanembu73223 278 294 309 390+ L2a1
163*Hide12 223 278 286 294 309 390+ L2a1a
164*Mafa16 071A 114A 129 213 223 278 390+ L2b
165*Mafa20 114A 129 172 213 223 278 390+ L2b
166*Mafa33 114A 129 172 213 223 278 390+ L2b
167 Kanembu54114A 129 189 213 218 223 278 390+ L2b
168 Buduma92114A 129 213 218 223 274 278 390+ L2b
169*Hide46 114A 129 213 223 274 278 390+ L2b
170*Mafa35 114A 129 213 223 274 278 390+ L2b
171*Mafa37 114A 129 213 223 274 278 390+ L2b
172 Kanuri21114A 129 213 223 278 354 390+ L2b
173 Kanuri44114A 129 213 223 278 362 390+ L2b
174*Masa84 114A 129 213 223 278 362 390+ L2b
175*Kotoko36 114A 213 223 255 274 278 287 362 390+ L2b
176**Fulani Borgor47 114A 213 223 278 362 390+ L2b
177**Fulani Borgor38 114A 213 223 278 362 390+ L2b
178 Kanembu58114A 213 223 278 362 390+ L2b
179**Fulani Borgor3 114A 129 213 223 278 294 355 362 390+ L2b1
180**Fulani Borgor34 114A 129 213 223 278 294 355 362 390+ L2b1
181**Fulani Tcheboua117 114A 129 213 223 278 294 355 362 390+ L2b1
182**Fulani Tcheboua127 114A 129 213 223 278 294 355 362 390+ L2b1
183**Fulani Tcheboua130 114A 129 213 223 278 294 355 362 390+ L2b1
184*Mafa38 114A 129 213 223 278 355 362 368 390+ L2b1
185**Fulani Borgor35 223 278 390+ L2c
186**Fulani Tcheboua120 223 278 390+ L2c
187**Fulani Tcheboua137 223 278 390+ L2c
188**Fulani Tcheboua142 223 278 390+ L2c
189*Masa72 223 278 390+ L2c
190 Buduma79223 278 318 390+ L2c1
191 Fali107111A 145 184 189 223 239 278 290 292 355 390+ L2d2
192*Hide24 111A 145 184 189 223 239 278 290 292 355 390+ L2d2
193**Fulani Tcheboua126 111A 145 184 223 239 278 292 355 390+ L2d2
194 Buduma69111A 145 223 239 278 292 355 390+ L2d2
195 Buduma73111A 145 223 239 278 292 355 390+ L2d2
196 Buduma78111A 145 223 239 278 292 355 390+ L2d2
197 Buduma86111A 145 223 239 278 292 355 390+ L2d2
198 Buduma96111A 145 223 239 278 292 355 390+ L2d2
199*Mafa44 223 L3*
200 Arabs Shuwa168129 189 223 242G 311 326 359 L3*
201 Kanembu75189 223 291 327 358 L3*
202 Kotoko47210 224 278 L3*
203 Kotoko60223 261 294 L3*
204 Kanembu66223 355 L3*
205**Fulani Borgor43 086 124 223 278 311 362 L3b
206 Kanuri18093 124 223 278 311 362 L3b
207 Kanembu76124 169 223 278 311 362 L3b
208**Fulani Tcheboua147 124 182C 183C 189 223 278 362 L3b
209 Arabs Chad86124 183C 189 223 278 362 L3b
210*Masa83 124 189 223 278 311 362 L3b
211 Arabs Shuwa186124 189 223 278 362 L3b
212*Hide5 124 223 240T 362 L3b
213*Masa82 124 223 240T 362 L3b
214*Mafa36 124 223 278 L3b
215*Masa109 124 223 278 L3b
216*Hide4 124 223 278 291 L3b
217 Arabs Shuwa180124 223 278 311 362 L3b
218 Buduma94124 223 278 311 362 L3b
219*Kotoko12 124 223 278 311 362 L3b
220 Kotoko56124 223 278 311 362 L3b
221*Masa70 124 223 278 311 362 L3b
222*Masa107 124 223 278 311 362 L3b
223 Arabs Shuwa183124 223 278 362 L3b
224 Fali13124 223 278 362 L3b
225 Fali67124 223 278 362 L3b
226 Fali80124 223 278 362 L3b
227 Fali100124 223 278 362 L3b
228 Fali102124 223 278 362 L3b
229 Fali106124 223 278 362 L3b
230**Fulani Tcheboua112 124 223 278 362 L3b
231**Fulani Tcheboua116 124 223 278 362 L3b
232**Fulani Tcheboua128 124 223 278 362 L3b
233**Fulani Tcheboua129 124 223 278 362 L3b
234**Fulani Tcheboua134 124 223 278 362 L3b
235**Fulani Tcheboua149 124 223 278 362 L3b
236 Kanembu117124 223 278 362 L3b
237 Kanembu118124 223 278 362 L3b
238 Kanuri15124 223 278 362 L3b
239 Kanuri47124 223 278 362 L3b
240*Mafa17 124 223 278 362 L3b
241*Masa106 124 223 278 362 L3b
242 Arabs Shuwa181124 223 362 L3b
243*Mafa4 051 223 278 362 L3b1
244*Mafa54 051 223 278 362 L3b1
245**Fulani Tcheboua144 093 223 278 311 362 L3b1
246**Fulani Borgor4 093 223 278 362 L3b1
247**Fulani Borgor5 093 223 278 362 L3b1
248**Fulani Borgor8 093 223 278 362 L3b1
249**Fulani Borgor11 093 223 278 362 L3b1
250**Fulani Borgor18 093 223 278 362 L3b1
251**Fulani Borgor20 093 223 278 362 L3b1
252**Fulani Borgor25 093 223 278 362 L3b1
253**Fulani Borgor27 093 223 278 362 L3b1
254**Fulani Borgor40 093 223 278 362 L3b1
255**Fulani Borgor42 093 223 278 362 L3b1
256**Fulani Borgor50 093 223 278 362 L3b1
257**Fulani Tcheboua114 093 223 278 362 L3b1
258**Fulani Tcheboua132 093 223 278 362 L3b1
259**Fulani Tcheboua133 093 223 278 362 L3b1
260**Fulani Tcheboua139 093 223 278 362 L3b1
261**Fulani Borgor12 189 223 278 358 362 L3b1
262**Fulani Borgor9 189 223 278 362 L3b1
263 Arabs Shuwa164223 278 362 L3b1
264 Arabs Shuwa178223 278 362 L3b1
265 Arabs Shuwa170111 124 223 L3d
266 Kanembu64111 124 223 L3d
267 Kanuri43111 124 223 L3d
268 Arabs Shuwa91124 148 223 257 L3d
269*Hide17 124 166 223 L3d
270 Kanuri41124 166 223 L3d
271 Kanuri45124 166 223 L3d
272*Mafa29 124 166 223 L3d
273*Mafa32 124 166 223 L3d
274 Masa81124 189 223 L3d
275 Arabs Chad84124 223 L3d
276 Arabs Chad89124 223 L3d
277 Arabs Chad94124 223 L3d
278 Arabs Chad96124 223 L3d
279*Hide23 124 223 264 L3d
280 Arabs Chad88124 223 288 311 L3d
281 Kanuri30124 223 311 L3d
282*Mafa13 124 213 223 319 L3d1
283 Arabs Shuwa82124 223 319 L3d1
284 Arabs Shuwa167124 223 319 362 L3d1
285**Fulani Borgor2 124 223 319 362 L3d1
286**Fulani Borgor28 124 223 319 362 L3d1
287**Fulani Borgor37 124 223 319 362 L3d1
288**Fulani Tcheboua111 124 223 319 362 L3d1
289**Fulani Tcheboua140 124 223 319 362 L3d1
290 Kanuri29124 223 319 362 L3d1
291 Arabs Shuwa185086 124 223 256 368 L3d2
292*Mafa6 086 124 223 256 368 L3d2
293*Mafa10 086 124 223 256 368 L3d2
294**Fulani Borgor22 124 223 327+ L3e1
295 Arabs Shuwa179172 223 327+ L3e1
296**Fulani Borgor32 187 223 327+ L3e1
297*Hide11 223 327+ L3e1
298 Kotoko54223 327+ L3e1
299*Mafa25 223 327+ L3e1
300 Kotoko58192 223 256 325D 327+ L3e1b
301 Kanuri36223 256 325D 327+ L3e1b
302 Fali85192 223 278 311 320+ L3e2
303 Fali101192 223 278 311 320+ L3e2
304 Arabs Shuwa81223 311 320+ L3e2
305 Fali1223 311 320+ L3e2
306 Fali57223 311 320+ L3e2
307 Fali58223 311 320+ L3e2
308 Fali66223 311 320+ L3e2
309**Fulani Borgor17 223 311 320+ L3e2
310*Masa73 223 311 320+ L3e2
311 Arabs Chad150223 320+ L3e2
312 Kanuri48223 320+ L3e2
313 Kotoko51223 320+ L3e2
314**Fulani Tcheboua135 150 172 189 223 224 320+ L3e2b
315**Fulani Tcheboua145 150 172 189 223 224 320+ L3e2b
316 Arabs Shuwa165150 172 189 223 320+ L3e2b
317**Fulani Tcheboua123 172 183C 189 209 223 234 311 319 320+ L3e2b
318**Fulani Borgor41 172 183C 189 223 284 320+ L3e2b
319**Fulani Borgor19 172 183C 189 223 311 320 358+ L3e2b
320 Buduma72172 183C 189 223 320+ L3e2b
321 Kotoko22172 183C 189 223 320 362+ L3e2b
322 Kanuri12172 183C 223 320+ L3e2b
323 Arabs Shuwa84172 189 223 320+ L3e2b
324 Arabs Shuwa86172 189 223 320+ L3e2b
325 Arabs Shuwa89172 189 223 320+ L3e2b
326*Masa93 172 189 223 320+ L3e2b
327 Fali94172 189 223 320 364+ L3e2b
328*Mafa3 093 223 265T+ L3e3
329 Kanuri24223 265T+ L3e3
330*Hide25 037 041 223+ L3e5
331 Kotoko10037 041 223+ L3e5
332 Fali2041 086 223+ L3e5
333 Fali76041 086 223+ L3e5
334 Kanuri26041 093 192 223+ L3e5
335**Fulani Tcheboua150 041 129 223+ L3e5
336 Kanembu71041 189 223 343+ L3e5
337 Arabs Shuwa87041 192 223+ L3e5
338 Arabs Shuwa166041 192 223+ L3e5
339 Kotoko61041 192 223+ L3e5
340*Masa101 041 192 223+ L3e5
341 Kanuri33041 221 223+ L3e5
342 Fali71041 223+ L3e5
343 Fali74041 223+ L3e5
344 Fali86041 223+ L3e5
345 Fali87041 223+ L3e5
346**Fulani Borgor45 041 223+ L3e5
347*Hide32 041 223+ L3e5
348 Kotoko28041 223+ L3e5
349 Kotoko32041 223+ L3e5
350 Kotoko34041 223+ L3e5
351 Kotoko39041 223+ L3e5
352*Kotoko42 041 223+ L3e5
353 Kotoko44041 223+ L3e5
354*Kotoko49 041 223+ L3e5
355 Kotoko62041 223+ L3e5
356*Masa110 041 223+ L3e5
357*Kotoko9 041 223+ L3e5
358*Hide7 041 223 254+ L3e5
359*Hide8 041 223 254+ L3e5
360*Mafa14 041 223 261+ L3e5
361*Mafa30 041 223 261+ L3e5
362*Mafa11 041 223 261+ L3e5
363 Arabs Shuwa80041 223 278+ L3e5
364 Fali78041 223 291+ L3e5
365*Masa79 041 223 291 298+ L3e5
366 Kanembu67041 223 325+ L3e5
367 Kanembu104041 223 343+ L3e5
368 Fali3223 311 320+ L3e5
369 Fali92223 311 320+ L3e5
370*Kotoko29 209 223 311 L3f
371*Masa91 053 188 209 223 L3f
372 Kanembu74093 209 223 301 311 L3f
373**Fulani Tcheboua148 189 209 223 311 L3f
374 Buduma81209 223 L3f
375 Buduma91209 223 L3f
376 Fali99209 223 259 L3f
377 Kotoko25209 223 311 L3f
378*Kotoko31 209 223 311 L3f
379 Kotoko37209 223 311 L3f
380 Fali60129 209 223 292 295 311 L3f1
381*Hide22 184 209 223 259 292 311 L3f1
382 Kanuri6209 214 223 292 311 L3f1
383 Kanembu115209 223 255 292 311 L3f1
384 Kanuri34209 223 266 292 311 L3f1
385 Kanembu52209 223 292 305 311 L3f1
386 Arabs Chad77209 223 292 311 L3f1
387 Arabs Shuwa177209 223 292 311 L3f1
388 Kanembu69209 223 292 311 L3f1
389 Kanembu108209 223 292 311 L3f1
390 Kanuri54209 223 292 311 L3f1
391 Kotoko24209 223 292 311 L3f1
392 Kotoko33209 223 292 311 L3f1
393*Kotoko41 209 223 292 311 L3f1
394*Kotoko50 209 223 292 311 L3f1
395*Kotoko64 209 223 292 311 L3f1
396*Mafa48 209 223 292 311 L3f1
397*Masa74 209 223 292 311 L3f1
398*Masa92 209 223 292 311 L3f1
399*Masa97 209 223 292 311 L3f1
400 Buduma71086 209 223 L3f2
401*Kotoko45 139 176 209 223 234 L3f2
402 Buduma95176 188 209 223 234 L3f2
403 Kanuri49176 188 209 223 234 L3f2
404*Kotoko2 176 188 209 223 234 L3f2
405 Buduma85176 188 209 223 234 278 L3f2
406 Kotoko15176 209 218 223 234 L3f2
407**Fulani Tcheboua121 176 209 223 L3f2
408*Masa89 176 209 223 234 L3f2
409*Masa96 176 209 223 234 L3f2
410 Kotoko8176 209 223 234 L3f2
411 Kotoko14176 209 223 234 L3f2
412 Buduma90176 209 223 234 284 L3f2
413*Masa95 176 209 223 234 284 L3f2
414*Mafa8 176 209 223 234 294 L3f2
415 Kotoko23188 209 215 223 234 L3f2
416 Kotoko11179 192 215 223 235T 256A 284 311 L3h1
417 Kotoko38179 192 215 223 235T 256A 284 311 L3h1
418*Kotoko40 179 192 215 223 235T 256A 284 311 L3h1
419 Kanuri42179 192 215 223 256A 284 311 L3h1
420 Kanembu53093A 223 293T 301 311 355 356 362 L4g
421 Kanuri40188 189 209 223 274 292 293T 311 316 335 355 362 L4g
422 Kanembu101093A 223 287A 293T 301 311 355 356 362 L4g2
423*Hide16 093G 189 223 287A 293T 311 355 362 L4g2
424 Buduma93093G 223 287A 293T 301 311 355 362 L4g2
425 Kanembu60111 129 148 166 187 189 223 254 278 355 360+ L5
426 Buduma80129 183C 189 223 249 311 M1
427 Buduma87129 183C 189 249 311 M1
428 Buduma89129 183C 189 249 311 M1
429 Arabs Chad92129 189 249 311 M1
430 Arabs Shuwa162129 189 223 249 311 359 M1a1
431 Arabs Shuwa173129 189 223 249 311 359 M1a1
432 Arabs Shuwa174129 189 223 249 311 359 M1a1
433 Arabs Shuwa176129 189 223 249 311 359 M1a1
434 Buduma83172 183C 189 223 249 250 320+ M1a2
435 Fali30CRS R
436 Arabs Chad95343 U3
437 Buduma76343 390 U3
438 Arabs Shuwa78356 U4
439 Arabs Shuwa79356 U4
440 Kanembu62192 212 298 311 U5a
441**Fulani Borgor21 189 192 270 320 U5b1
442**Fulani Borgor23 189 192 270 320 U5b1
443**Fulani Borgor24 189 192 270 320 U5b1
444**Fulani Borgor33 189 270 320 U5b1
445 Kanuri53172 219 278 U6
446*Mafa1 172 219 311 U6
447 Arabs Shuwa169172 290++ U6
448**Fulani Tcheboua124 298 V
Figure 2.

Haplogroup frequencies in different African regions.

Haplogroup L0a is mainly of East African origin, diversifying there around 40,000 years ago (Salas et al. 2002). Its major derived sub-clades (L0a1, L0a2 and L0a1a) spread into Central and South-Eastern African regions. It is believed that these were brought into the latter regions mainly by the eastern stream of the Bantu expansion, whereas the role of the western Bantu stream is still uncertain (Salas et al. 2002), although some details are beginning to emerge (Plaza et al. 2004; Beleza et al. 2005). L0a1 and its main sub-clade L0a1a are represented by 10 haplotypes (∼6% of the mtDNAs) in the Chad Basin. Seven of these 10 haplotypes (13/28 of the L0a mtDNAs observed) are found in the three Chadic-speaking populations of the Kotoko, Mafa and Masa at relatively high frequencies (11%). Eight of these haplotypes match other neighbouring western Central African samples. The most frequent type (matching the basal motif of L0a1; sensu Salas et al. 2002) is frequent in, for example, Nubians, but is also found in other East African populations (e.g. in the Sudan and Turkana). L0a is absent in Shuwa Arabs and both Fulani sample sets. Several representatives of the sub-clade L0a1a (which is identified in the HVS-I region by mutations C16168T and C16278T on top of L0a [N= 15]) were detected. The Chad Basin L0a1 types (with the exception of the basal L0a1 type) show indications of some differentiation in situ, as they are close derivatives of the pre-existing L0a1 types found in East (and/or South-East) Africa. L0a2, probably of Central African origin (Soodyall & Jenkins, 1993; Salas et al. 2002), was not observed in the Chad Basin samples used in this study.

Haplogroup L1b, which probably spread into Central and North Africa along the Atlantic coastline, seems to be of West African origin. L1b is represented by eight different haplotypes (36 mtDNAs). It is highly prevalent in both the Fulani groups (29% in the sample from Chad and 20% in the sample from Cameroon), but also in the Kanembu (12%) and Fali (10%); it is found sporadically in some other ethnic groups. L1b is completely absent in three Chadic-speaking groups (the Hide, Masa and Buduma) and in the Arabs from Chad. The L1b haplogroup occurs only in the form of L1b1, defined by the (mutationally unstable) variant A16293G. Most of the Chad Basin L1b types match, or are close derivatives of, West African types.
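The percentages quoted in this section are within-population proportions of sequences assigned to a haplogroup. A minimal sketch of that tabulation follows; the function name and the toy records are hypothetical and are not taken from Table 1.

```python
from collections import Counter, defaultdict


def haplogroup_frequencies(records):
    """Return {population: {haplogroup: percentage}} from (population, haplogroup) pairs."""
    counts = defaultdict(Counter)
    for population, haplogroup in records:
        counts[population][haplogroup] += 1

    frequencies = {}
    for population, hg_counts in counts.items():
        total = sum(hg_counts.values())
        frequencies[population] = {
            hg: 100.0 * n / total for hg, n in hg_counts.items()
        }
    return frequencies


# Hypothetical toy input: one (population, haplogroup) pair per sampled individual.
records = [
    ("PopA", "L1b1"), ("PopA", "L1b1"), ("PopA", "L2a"), ("PopA", "L3b"),
    ("PopB", "L3e5"), ("PopB", "L3f2"),
]
freqs = haplogroup_frequencies(records)
print(freqs["PopA"]["L1b1"])  # 50.0
```

A haplogroup that is absent from a population simply has no entry in that population's dictionary, which corresponds to the "completely absent" statements in the text.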

The history of haplogroup L1c remains enigmatic (Beleza et al. 2005; Plaza et al. 2004; Richards et al. 1993; Salas et al. 2002, 2004a, 2005b). The present data seem to point to (somewhere in) Central Africa as the ‘cradle’ of L1c, with very restricted overlap into south-eastern areas (Salas et al. 2002; Destro-Bisol et al. 2004). L1c in the Chad Basin is represented by eight haplotypes (10 sequences); these are found in all three linguistic families except for the Arab groups. This Central African haplogroup occurs in the Chad Basin in the form of different haplotypes containing mutation A16293G (L1c1; note that this definition is provisional [Salas et al. 2002] since there is evidence indicating that A16293G is not an appropriate diagnostic site due to its high mutation rate), but also in those bearing transition A16215G (L1c3). It is important to note that the L1c2 haplotypes, occurring predominantly in Americans of African origin, were not observed in the samples from the Chad Basin. This seems to indicate, in accordance with previous studies (Salas et al. 2002, 2004a), that the contribution of the Chad Basin (in contrast to the western and south-western Atlantic façade) to the African-American mtDNA pool was probably very limited.

The L5 and L1f haplogroups are geographically restricted to the East African region, where their origins are also expected to lie. Only one L5 haplotype was found in the Chad Basin samples (a single Kanembu individual); it occurs in the form of L5a (L1e1 in Figure 5 of Salas et al. 2002) with mutations C16111T, A16254G, C16355T and C16360T.

Haplogroup L2 is commonly divided into four main branches, termed L2a, L2b, L2c and L2d (Bandelt et al. 2001; Torroni et al. 2001b), of which L2a is the most numerous and most widespread within Africa. Where L2a diversified is still an open question; it could have been in West, Central or East Africa. The origin of the remaining known L2 clades (L2b, L2c and L2d) is unambiguously in West Africa (Salas et al. 2002). L2a is represented in the Chad Basin, as it is everywhere in sub-Saharan Africa, by a large number of sequences (N= 78). Of the previously known clades of this haplogroup, only L2a1 (identified by the mutationally unstable variant A16309G; N= 35, 16 haplotypes) was detected, and only one sequence, containing mutation C16286T, could be further classified as L2a1a. L2a is particularly abundant in the Kanembu (38% of the sample), but is also relatively frequent in nomadic Arabs (33%). It is absent from the Fulani sample from Cameroon, and is found at a relatively low frequency (6%) in the Fulani population from Chad. It is important to note that most of the Chad Basin L2a types do not match either the West or East African L2a types. This again suggests some diversification of this clade in situ. Positions T16209C C16301T C16354T on top of L2a1 define a small sub-clade, dubbed L2a1c by Kivisild et al. (2004, Figure 3) (see also Figure 6 in Salas et al. 2002), which mainly appears in East Africa (e.g. Sudan, Nubia, Ethiopia) and West Africa (e.g. Turkana, Kanuri). In the Chad Basin four different L2a1c types, one or two mutational steps from the East and West African types, were identified.

Figure 3.

Network of (a) haplogroup L3e5 and (b) L3f2. Brackets indicate variants outside the common HVS-I segment 16090-16365 (shared by all the samples). A “#” indicates the (tentative) root type. Parallel mutations are underlined. Circle sizes are proportional to the haplotype frequency in the sample. Population codes: Kn1 = Lake Chad Kanuri, Tk = Turkana (Watson et al. 1997b); Pr = Portugal (Pereira et al. 2000); Mn = Mandara, Ou = Ouldeme, Pd = Podokwo, Tp = Tupuri, Fu = Fulbe, Tl = Tali (Coia et al. 2005); Fl = Fali, FB = Fulani Borgor, Hi = Hide, Ko = Kotoko, Ms = Masa, Mf = Mafa, As = Arabs Shuwa, Kb = Kanembu, FT = Fulani Tcheboua, Kn2 = Kanuri, Bu = Buduma (present study and Černý et al. 2005; see footnote to Table 1); Mr = Morocco (Brakez et al. 2001), Ag = Algeria, Morocco Arabs, Tn = Tunisia (Plaza et al. 2003); Mt = Matmata, Sn = Tunisian Berbers (Fadhlaoui-Zid et al. 2004); Can = Canary Islands (Rando et al. 1999); Sr = Serer (Rando et al. 1998); Ice = Iceland (Sajantila et al. 1995); Ke = Kenya (Brandstätter et al. 2004); Sw = Switzerland (Dimo-Simonin et al. 2000); Ng = Nigeria (Torroni et al. 2006); Sou = Morocco, Souss Valley (Brakez et al. 2001); Su = Sudan (Krings et al. 1999); Sic = Sicily (Forster et al. 2002). For the L3e5 phylogeny, there are a few sequences displayed in Figure 2 of Fadhlaoui-Zid et al. (2004) that the authors of the present study could not trace to their original references; for this reason these sequences were not included in Figure 3a. The arrows in both Figures 3a and 3b indicate the two complete genomes taken from Torroni et al. (2006; see text for more information).

There is another small branch deriving from the basal L2a type that could tentatively (and consistent with Kivisild et al. 2004) be termed L2a1d. This small clade would be defined by positions T16189C C16291T T16311C T16229C on top of L2a (see Figure 6 in Salas et al. 2002), and is found in East Africa and also Central Africa. In the Chad Basin we found five L2a1d derivatives lacking mutation T16189C. Both L2a1c and L2a1d are common in all of the linguistic branches of the Chad Basin population samples analyzed.

L2b is represented by 11 haplotypes (20 mtDNAs). This West African haplogroup is absent in both Arab groups and in the Fali; its highest frequency was detected in the Mafa (19%). L2b is also frequent in both Fulani sample sets where it occurs mainly as clade L2b1, differentiated by mutations C16355T and T16362C. Chad Basin L2b types do not, for the most part, match West African types. This is also consistent with the apparent absence of the L2b Chad Basin types in America (in contrast to the West African L2b types).

A total of 17 sequences were classified as L2c or L2d. These West African haplogroups occur at relatively high frequencies not only in both of the Fulani sample sets, but also in the Buduma sample set (mainly matching the basal type, which is highly prevalent in West Africa). They are virtually absent in both Arab groups, both Nilo-Saharan groups, the Kotoko and the Mafa.

Haplogroup L3A (L3 without M and N) is most frequent in East Africa (∼50%), but can also be found in other parts of the continent. It is divided into several highly diversified sub-haplogroups, of which L3f and L4 are characteristic of East Africa, while L3b and L3d are specific to West Africa. The most diversified, the most widespread, the most numerous and probably the oldest of the L3A types is haplogroup L3e, which dates back 46,000 years (Bandelt et al. 2001). We find that in the Chad Basin the highest number of sequences (N= 88) fall into the West African haplogroups L3b and L3d. They are present in every population and are especially abundant in both of the Fulani sample sets, where they occur in more than 30% of samples. The lowest frequency of these West African haplogroups was identified in the Buduma (3%). The most prevalent L3b types in the Chad Basin match the basal type characterised by T16124C C16223T C16278T T16362C.

L3e is represented by 35 haplotypes (N= 75). It was detected in all of the Chad Basin population sample sets, most notably in the Fali, where it reaches a frequency of 40%. Relatively high frequencies of L3e were also found in the Kotoko and Shuwa Arabs (25% and 24%, respectively). Within this haplogroup, representatives of L3e1 (determined in HVS-I by mutations C16223T C16327T) and its subclade L3e1b (a deletion at 16325 on top), the clade L3e2 (C16223T C16320T) and its subclade L3e2b (16172-16189 on top), and the clade L3e3 (16223-16265T) can be detected in the Chad Basin samples. On the other hand, we did not find L3e1a (16185-16311) or L3e4 (16223-16264) in this region. Consistent with other L-types in the Chad Basin, most of the American L3e1 and L3e2 types do not match those found in the Chad Basin, and mainly match West African types. It is also interesting to note that some L3e2 (and perhaps also L3e1) types are probably of Central African origin.
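The sub-clade assignments described above rest on diagnostic HVS-I motifs. A minimal sketch of such motif matching follows, using the shorthand of Table 1 (position minus 16000). The motif dictionary is transcribed from the paragraph above, while the function name and the "most specific motif wins" rule are illustrative assumptions. In the study itself classification also relied on RFLP status, so HVS-I motifs alone would not reproduce every assignment in Table 1.

```python
# Diagnostic HVS-I motifs in Table 1 shorthand (position minus 16000),
# transcribed from the text; "325D" denotes the deletion at 16325.
MOTIFS = {
    "L3e1": {"223", "327"},
    "L3e1b": {"223", "327", "325D"},
    "L3e2": {"223", "320"},
    "L3e2b": {"223", "320", "172", "189"},
    "L3e3": {"223", "265T"},
}


def classify(haplotype):
    """Return the most specific clade whose motif is contained in the haplotype.

    `haplotype` is a space-separated string of variants, e.g. "223 311 320".
    Returns None when no motif matches.
    """
    variants = set(haplotype.split())
    matches = [(len(motif), name) for name, motif in MOTIFS.items() if motif <= variants]
    if not matches:
        return None
    # The longest matching motif is taken as the most specific assignment.
    return max(matches)[1]


print(classify("223 256 325D 327"))  # L3e1b
print(classify("172 189 223 320"))   # L3e2b
print(classify("223 265T"))          # L3e3
```

Because the rule is strict, a sequence that has lost one motif position through back mutation falls back to the parent clade or to no assignment; phylogenetic network analysis is better suited to such cases.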

We also found a large set of L3e sequences in our samples (−3592 HpaI and +2349 MboI) carrying mutations A16041G C16223T (N= 39). These mtDNAs appear in all the Chad Basin populations, with the exceptions of the Arabs from Chad and the Buduma from Niger; their occurrence in Fulani samples is also very low. A search through our database of more than 6500 African mtDNA HVS-I profiles revealed 52 other sequences carrying mutations A16041G C16223T. We also note that these sequences are probably related to the complete Nigerian genome (#18) reported by Torroni et al. (2006), which carries the HVS-I variants A16037G A16041G C16223T T16311C T16519C and A73G C150T A263G 315+C T398C 523–524del in HVS-II. Note that this clade was detected earlier (but left unnamed) by Fadhlaoui-Zid et al. (2004), and was suggested to be of North African origin since “no match was found with sub-Saharan populations” (p. 230; Fadhlaoui-Zid et al. 2004). We therefore name this minor sub-clade L3e5, defined by the basal HVS-I motif A16041G C16223T, with the sub-clade L3e5a defined by A16037G on top. The L3e5 network of Figure 3a reflects a clear star-like phylogeny. Most of the L3e5 types are found in western Central Africa, although there seems to be important diffusion into North Africa (interpreted by Fadhlaoui-Zid et al. (2004) as evidence for the autochthonous character of L3e5 in North Africa); the root type is relatively prevalent in the Chad Basin populations, and there is a plethora of derived haplotypes (with nearly no matches in West Africa), indicating that L3e5 evolved in situ in this region. Taking the root type of L3e5 as the founder in western Central Africa (and more specifically in the Chad Basin), we estimate an expansion for this clade at about 11,450 ± 3,650 Y.B.P. in this region. In North Africa this clade seems to be more recent (with a larger standard deviation), dating to 7,100 ± 3,800 Y.B.P. There are only three West African L3e5 types, two of which match the root type while the other is found in the Serer of Senegal (Rando et al. 1998). There is no African-American mtDNA belonging to L3e5, which contrasts with the high prevalence of other L3e types in African-Americans (e.g. see Figure 9 in Salas et al. 2002); this again seems to indicate that the Chad Basin did not contribute significantly to the Atlantic slave trade.
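Expansion ages of this kind are commonly obtained from the ρ statistic, the mean number of mutations separating the sampled sequences from the assumed founder type, scaled by a mutation rate. The sketch below illustrates that calculation under two assumptions that are not stated in this section: a calibration of 20,180 years per transition in the HVS-I segment 16090-16365, and a standard error computed from the number of samples descending through each branch of the network. The function name and the toy network are hypothetical.

```python
import math

# Assumed HVS-I calibration (years per transition, segment 16090-16365);
# this value is not given in this section and is used for illustration only.
YEARS_PER_MUTATION = 20180


def rho_age(n_samples, branches, years_per_mutation=YEARS_PER_MUTATION):
    """Estimate clade age and its standard error from a rooted network.

    n_samples: total number of sampled sequences, including root-type carriers.
    branches:  list of (mutations_on_branch, samples_descending_through_branch).
    Returns (age_in_years, standard_error_in_years).
    """
    # rho: average number of mutations between each sample and the root.
    rho = sum(m * k for m, k in branches) / n_samples
    # Standard error: each branch contributes mutations * descendants squared.
    sigma = math.sqrt(sum(m * k * k for m, k in branches)) / n_samples
    return rho * years_per_mutation, sigma * years_per_mutation


# Hypothetical star-like network: 10 samples, 6 of them carrying the root type,
# and three one-step derived haplotypes carried by 2, 1 and 1 samples.
age, se = rho_age(10, [(1, 2), (1, 1), (1, 1)])
print(round(age), round(se))
```

In a star-like phylogeny most branches lead directly from the root to a single derived type, so ρ stays small and the relative standard error is large, which is consistent with the wide intervals quoted in the text.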

Another well-represented haplogroup in the Chad Basin sample sets is L3f (defined in HVS-I by T16209C C16223T T16311C). Interestingly, a substantial number of sequences carry the extra variant C16176T (and most of them also carry C16234T), while lacking the L3f diagnostic site T16311C (Figure 3b). This could constitute a new sub-clade of L3f, tentatively dubbed L3f2 here. The presence of C16188T on top of the L3f2 diagnostic motif would define L3f2a. L3f2a is probably related to the L3f complete genome of Nigerian mtDNA (#18) reported in Torroni et al. (2006; their Figure 1). Curiously, the HVS-II region of this Nigerian mtDNA nearly matches a sample from Switzerland reported in Dimo-Simonin et al. 2000 (#142), sharing HVS-II positions 073-143-189-318 (some of which probably constitute part of the HVS-II diagnostic motif of L3f2). These sequences also mutate back at site T16311C, which probably constitutes a parallel mutation within L3f (note that this position is relatively unstable, e.g. Bandelt et al. 2002; Malyarchuk & Rogozin, 2004).

The phylogeography of L3f is particularly interesting. This haplogroup is probably of East African origin (Salas et al. 2002), while L3f1 appears to have spread at an early date into West Africa (and is also well represented in African-Americans; Salas et al. 2002) and probably into the Arabian Peninsula (Kivisild et al. 2004). The L3f2 root type is found in two western Central African sequences with two derived mtDNAs from East Africa. L3f2 is however found exclusively in western Central Africa (N= 17), and L3f2a is mostly found in this region as well. It is interesting that L3f2 diversification occurs almost exclusively in Chadic-speaking groups (the Chadic branch of the Afro-Asiatic linguistic family); note that these groups together constitute only 38% of our Chad Basin sample. It should also be noted that other published sub-Saharan sequences from the populations of North Cameroon (the Ouldeme, Podokwo and Mandara) belong to the Chadic group as well. If it is tentatively assumed that the root of L3f2 (16176-16209-16223-16234) constitutes a founder type, the TMRCA of this clade in western Central Africa would be 28,950 ± 11,600 Y.B.P., contemporary with L3f1 (Salas et al. 2002). The ‘double’ star-like shape of the L3f2 phylogeny suggests, however, the existence of at least two different expansion events, one of them affecting L3f2*. This assumption leads us to tentatively estimate an expansion event in the Chad Basin around 15,856 ± 5,943 Y.B.P. Consistent with the phylogeography of other clades, there are no African-American L3f2 types. The introduction of L3f2 into North Africa is limited (Figure 3b), as is also true for other L3f lineages. The only East African L3f2 detected is within L3f2a, and corresponds to a Sudanese individual.

Haplogroup L4 occurs only in very small numbers, as predicted by the hypothesis formulated in Salas et al. (2004b). One Buduma matches the common L4g2 type in western Central Africa (matching individuals from many different western Central African populations, such as the Daba, Fali, Mandara, Podokwo, etc.). There are three other Chad Basin mtDNAs belonging to L4g2.

Finally, haplogroups L0d and L0k have been detected almost exclusively in the Khoisan people of southern Africa and in neighbouring Bantu populations, e.g. in Mozambique (Pereira et al. 2001; Salas et al. 2002); it is very likely that these are the last remnants of formerly more numerous and more diversified haplogroups that did not survive the period of Bantu expansion (Vigilant et al. 1991). As expected, these lineages were completely absent in the Chad Basin samples.

Principal Component Analysis

PCA is the usual method for summarising population relationships; here the intention is to observe the mtDNA patterns of the Chad Basin populations within the general African landscape. The pattern of PCA1 (Figure 4a) reflects the close relationship between the Chad populations and those of western Central and West Africa, and shows a clear separation from East Africa and an even greater distance to North and South Africans. The PCA2 plot reflects the proximity of some Chadic-speaking populations (e.g. the Kotoko and Buduma) to East Africa (reflecting the higher frequency of the L0a and, for example, L3f lineages in the Kotoko and L3f and M1 mtDNAs in the Buduma), while others are more closely related to West Africa (e.g. the Borgor Fulani and the Tcheboua Fulani), reflecting the prevalence of some typical West African lineages, e.g. L1b, L3b/d. PCA3 accentuates the distances of all the populations from North and South Africa as well; these regions behave as outliers, reflecting on the one hand chiefly the Khoisan component in South Africa, and on the other the European character of North Africa. PCA1, PCA2 and PCA3 account for 22.7%, 18.4% and 15.7% of the total variation, respectively.
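The variance shares quoted for each component are eigenvalues of the covariance matrix of the haplogroup frequency profiles, divided by the total variance. The sketch below extracts the first component by power iteration in plain Python; it stands in for whatever statistical package was actually used, and the frequency matrix is hypothetical rather than taken from the study.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores, explained_share) for the first principal component.

    rows: one frequency profile per population (equal-length lists of floats).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]

    # Sample covariance matrix of the haplogroup frequencies.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    # Power iteration; the start vector is deliberately non-uniform because
    # frequency profiles sum to one, which makes the all-ones direction
    # carry zero variance.
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    cov_v = [sum(a * b for a, b in zip(row, v)) for row in cov]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
    return v, scores, eigenvalue / total_variance


# Hypothetical frequency profiles (rows: populations, columns: haplogroups).
profiles = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
loadings, scores, share = first_principal_component(profiles)
print(share)
```

In this toy input the first two populations receive scores of one sign and the last two of the opposite sign, which is the kind of separation read off the component plots.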

Figure 4.

(a) Plot showing the first, second, and third principal components of the haplogroup frequency profiles for the African samples. HI = Hide, KO = Kotoko, MS = Masa, MF = Mafa, AS = Shuwa Arabs, AC = Chad Arabs, KB = Kanembu, FT = Tcheboua Fulani, FB = Borgor Fulani, KN = Kanuri, BU = Buduma, FL = Fali, SA = South Africa, EA = East Africa, NA = North Africa, SW = South-West Africa, SE = South-East Africa, WCA = western Central Africa, WA = West Africa. (b) MDS plot showing the general location of the Chad Basin samples in the context of 48 selected sub-Saharan populations (see the text). Colours: orange = SA populations, red = EA populations, pink = WCA populations, blue = Chad Basin populations, green = WA populations. The populations analysed in this study (which include those from Černý et al. 2004 and Černý et al. 2006) are labelled as in Figure 3a. Populations are numbered as in Table S2. (c) The first and second MDS dimensions of the mtDNA variability displayed by the Chad Basin populations in the context of the main African regions.

Multidimensional Scaling

MDS plotting of FST genetic distances obtained from 93 pairwise population comparisons revealed the outlying positions of North African and some (mainly Khoisan) sub-equatorial and Pygmy populations, without providing any clear visualisation of the relationships between the Chad Basin populations and their neighbours (data not shown). To better understand the genetic pattern of the area under investigation, we further analysed 48 selected populations. In addition to the aforementioned outliers, island populations (e.g. Bioko) were also excluded, as their contribution to the continental groups was minimal and they are likely to have been susceptible to founder effects and drift after the initial settlement of their territories. Figure 4b clearly shows the homogeneity of West African populations (at the upper left-hand side of the plot) on the one hand, and the dispersion of the East and South African groups on the other. The Chad Basin populations lie between these two groups. The two Arab samples, the Kotoko, the Mafa, the Masa, the Kanuri and the Buduma are closer to the East African populations living in or near the Ethiopian highlands; the others (both Fulani sample sets, the Hide, the Fali and the Kanembu) are linked instead to the West African group. In respect of the second dimension (vertical axis), however, the Chad Basin populations display no intelligible geographic or linguistic orientation. Figure 4c shows the MDS plot of the Chad Basin populations in the context of the main African regions; it mirrors in its first and second dimensions the pattern displayed by the PCA.
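The step from a matrix of pairwise distances to a two-dimensional plot can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which MDS variant was used, so classical (metric) MDS is substituted here, the FST-like distances are hypothetical, and the eigenvectors are obtained by power iteration, which finds the eigenvalue of largest magnitude and therefore presumes distances that are close to Euclidean.

```python
import math
import random

# Hypothetical pairwise FST-like distances between four populations.
# These values are invented for illustration and are not data from the study.
LABELS = ["PopA", "PopB", "PopC", "PopD"]
DIST = [
    [0.00, 0.03, 0.04, 0.05],
    [0.03, 0.00, 0.05, 0.04],
    [0.04, 0.05, 0.00, 0.03],
    [0.05, 0.04, 0.03, 0.00],
]


def double_centre(dist):
    """Return B = -1/2 * J * D^2 * J, the Gram matrix implied by the distances."""
    n = len(dist)
    squared = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in squared]  # equals column means (symmetry)
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (squared[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def leading_eigenpair(matrix, iterations=1000, seed=0):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    rng = random.Random(seed)
    # Random start: the all-ones vector is in the null space of B.
    vec = [rng.uniform(-1.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        product = [sum(a * b for a, b in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in product]
    product = [sum(a * b for a, b in zip(row, vec)) for row in matrix]
    return sum(v * p for v, p in zip(vec, product)), vec


def classical_mds(dist, dims=2):
    """Embed the populations in `dims` dimensions; returns one point per row."""
    gram = double_centre(dist)
    n = len(gram)
    axes = []
    for _ in range(dims):
        value, vec = leading_eigenpair(gram)
        value = max(value, 0.0)  # a negative eigenvalue has no Euclidean meaning
        axes.append([math.sqrt(value) * x for x in vec])
        # Deflation: remove the dimension just extracted.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return [list(point) for point in zip(*axes)]


coords = classical_mds(DIST)
for label, (x, y) in zip(LABELS, coords):
    print(f"{label}: dimension 1 = {x:+.4f}, dimension 2 = {y:+.4f}")
```

Because the toy distances happen to be exactly Euclidean in two dimensions, the embedded points reproduce them without distortion; real FST matrices generally cannot be embedded perfectly, which is why axes such as the second dimension in Figure 4b may lack a clear interpretation.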

Apportioning of Genetic Variance

We carried out an analysis of molecular variance (AMOVA) on the 12 Chad Basin populations analysed in the present work. Most of the genetic variation (∼96%) occurs within populations, while variation between populations accounts for a small but non-negligible ∼4% (P < 0.001; 20,000 permutations). Given that the FST value for the whole African continent is 0.12, the Chad Basin populations show a relatively high level of genetic homogeneity.
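The partition of variance into within-population and between-population components, and the permutation test behind the P value, can be illustrated with a simplified one-level AMOVA. This sketch is not the procedure or the data of the present study: the haplotype labels are hypothetical, and the distance between two individuals is assumed to be 0 for identical haplotypes and 1 otherwise, whereas a full analysis would normally use the number of sequence differences.

```python
import random
from itertools import combinations

# Hypothetical haplotype labels per population; not data from the study.
SAMPLES = {
    "PopA": ["h1", "h1", "h2", "h3", "h4", "h5"],
    "PopB": ["h1", "h2", "h2", "h3", "h6", "h7"],
    "PopC": ["h1", "h3", "h4", "h8", "h8", "h9"],
}


def sum_squared_deviations(haplotypes):
    """Sum of squared pairwise distances divided by the sample size.

    Assumption: distance is 0 between identical haplotypes and 1 otherwise.
    """
    differing_pairs = sum(1 for a, b in combinations(haplotypes, 2) if a != b)
    return differing_pairs / len(haplotypes)


def amova(groups):
    """One-level AMOVA; returns (sigma2_among, sigma2_within, phi_st)."""
    sizes = [len(g) for g in groups]
    total_n, k = sum(sizes), len(groups)
    pooled = [h for g in groups for h in g]
    ssd_within = sum(sum_squared_deviations(g) for g in groups)
    ssd_among = sum_squared_deviations(pooled) - ssd_within
    msd_among = ssd_among / (k - 1)
    msd_within = ssd_within / (total_n - k)
    # Average group size adjusted for unequal sampling.
    n0 = (total_n - sum(s * s for s in sizes) / total_n) / (k - 1)
    sigma2_among = (msd_among - msd_within) / n0
    sigma2_within = msd_within
    total = sigma2_among + sigma2_within
    phi_st = sigma2_among / total if total else 0.0
    return sigma2_among, sigma2_within, phi_st


def permutation_p_value(groups, permutations=999, seed=1):
    """Share of random reassignments with phi_st at least as large as observed."""
    rng = random.Random(seed)
    observed = amova(groups)[2]
    pooled = [h for g in groups for h in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if amova(shuffled)[2] >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


groups = list(SAMPLES.values())
sigma2_among, sigma2_within, phi_st = amova(groups)
p_value = permutation_p_value(groups)
print(f"between populations: {100 * phi_st:.1f}% of the variation")
print(f"within populations:  {100 * (1 - phi_st):.1f}% of the variation")
print(f"permutation P value: {p_value:.3f}")
```

With this estimator the P value can never fall below 1/(permutations + 1), so 20,000 permutations allow values down to roughly 0.00005. A between-population share of about 4% corresponds to a fixation index of about 0.04, which is the quantity compared with the continental value of 0.12 above.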

Discussion

Africa is typically divided into broad areas, namely North, West, East, western Central (or Central), South-East, South-West, and Southern (e.g. Salas et al. 2002, 2004a). The Chad Basin is represented in today's mtDNA database by populations of Kanuri, Fulbe and Hausa sampled mainly from northern Nigeria and southern Niger (Watson et al. 1997); analysis of these samples seems to indicate a closer mtDNA affinity to the West African mtDNA gene pool than to that of East Africa (Salas et al. 2002).

We have aimed here to shed light on the role played by Lake Chad in the population history of the Sahel-Sudan belt of Africa, by analysing the mtDNA heritage of 12 different ethnic groups from this region. The genetic differentiation of the Chad Basin groups from neighbouring populations in East, western Central and West Africa, as measured by FST distances, is relatively small. This continuous and geographically determined Sahel-Sudanic mtDNA landscape supports the idea of a weak linguistic contribution to mtDNA history in this part of the world. More pronounced differentiation is reported only for the continuously migrating Fulani nomads, who are more closely related to the West African pool: they are not statistically differentiated from Guinea-Bissau populations now living more than 3,000 km away (Černý et al. 2006).

All of the analyses carried out in the present study point to a close relationship between the Chad Basin populations and western Central Africa, with affinities to West Africa as well as to some East African features. The PCA (Figure 4a) and the MDS (Figure 4b) plots summarise this general pattern. AMOVA also indicates a close relationship among the Chad Basin populations, with only ∼4% of the variation occurring between populations. The percentages of shared haplotypes between the Chad Basin and the main African regions (Table 2) also accord with this general scenario.

The haplogroup profiles of the Chad Basin populations revealed a somewhat unexpected absence of some clades, such as the Bantu haplogroup marker L0a2, which is highly prevalent in South-East Africa as well as in western Central Africa (and, unsurprisingly, in Americans of recent African ancestry; Salas et al. 2002, 2004a, 2005b,d). The absence of the L0a1 clade in both of the Fulani sample sets, and their close resemblance to the West Africans from Guinea-Bissau, adds support to the theory of an East African origin for the L0a haplogroup. The presence of some L0a* and L0a1 mtDNAs in the Chad Basin, however, indicates some ancient connection with East Africa. The absence of mtDNA belonging to L1b* in the Chad Basin is more enigmatic. L1b1 is its most important ‘daughter’ haplogroup, and has been detected mainly in West Africa (and consequently in African-Americans). The occurrence of L1b1 in both of the Fulani Chad Basin sample sets at a high frequency (over 20%) accords with their West African origin. L1c is represented by the L1c1 and L1c3 clades, but L1c2 is absent from the Chad Basin database. The most likely homeland of the L1c haplogroup is western Central Africa (e.g. the Congo delta), from where some of its two main clades reached West and South-East Africa. The youngest clade, L1c2, was carried to South-East Africa only by the Bantu migration. The low frequency of L1c in the Chad Basin populations indicates that this region was partially isolated from more equatorial western Central African populations. The very rare L5b haplogroup has been reported to be virtually restricted to East Africa alone; we have found one (Kanembu) representative living east of Lake Chad, so Kanem might be considered the westernmost extent of this haplogroup. It is interesting that the second clade, L5a (L1e2 in Salas et al. 2002), which is otherwise more widespread, was not found in our samples.

Some other haplogroups appear in the Chad Basin at relatively high frequencies. Thus L2a1, which is widely distributed throughout sub-Saharan Africa, is also very well represented in the Chad Basin (but only one L2a1a sequence was identified). The lack of L2a1b is not surprising, as this clade has a more or less south-eastern distribution (Richards et al. 2004; Salas et al. 2002). Both L2b* and L2b1 are represented in the Chad Basin. The West African origin of L2b is supported here by its presence in Fulani samples, mainly in the form of L2b1, which is otherwise much more confined to West Africa than L2b*. Representatives of L2c and L2d were also found; L2d, in the form of L2d2 (defined by the 111A transversion and four additional mutations, G16145A, C16239T, C16292T and C16355T), is present at a high frequency in the Buduma. Phylogenetic reconstruction and the phylogeographic patterns of the sub-clades L3f2 and L3e5 indicate their (probable) autochthonous origin in the Chad Basin, with some sporadic representatives in other parts of Africa in the case of L3f2, and a significant frequency in North Africa for some L3e5 types. They both seem to reflect the existence of prehistoric population expansions, as indicated by the star-like appearance of their phylogenies. One of these appeared before the last glacial maximum, at about 28,950 ± 11,600 Y.B.P. (L3f2); there is, however, evidence for a population expansion at 15,856 ± 5,943 Y.B.P. for L3f2*. L3e5 shows diversification at about 11,400 ± 3,800 Y.B.P. During the last glacial maximum the Chad Basin experienced a dry period known as the kanémien, but with the beginning of the Holocene Lake Megachad formed and the Chad Basin became a suitable place to find new foraging (fishing and hunting) opportunities. Some connections from North Africa can also be seen in the archaeological record at this time (Haour, 2003); such evidence also fits well with the expansion event detected for L3e5 from this region around 7,100 ± 3,800 Y.B.P.
Later, some pastoral groups also entered the Chad Basin. The archaeological data from western parts of the Republic of Sudan, especially from the shores of the Wadi Howar (a tributary of the Nile), suggest an important human migration to the Chad Basin from the Upper Nile valley, somewhere between the third and fourth cataracts (Keding 1993; Blench 1999). This new colonisation of eastern parts of the Chad Basin traversed large river valleys with some expanses of water. It is, then, possible that one of these demographic expansions, detectable in the still sparse archaeological record of the Chad Basin, was responsible for the star-like phylogenetic shape of L3f2* and L3e5.

It is interesting that the analysis of our population samples from the Chad Basin has enabled the identification of the most likely geographical origin of L3e5 on the sub-Saharan side, in contrast to the North African geographic origin suggested by Fadhlaoui-Zid et al. (2004). The fact that these clades (L3f2 and L3e5) have not yet been found in East Africa may indicate that the human population(s) in which these diversifications occurred remained, with the ongoing drying of the climate, isolated from related East African groups. The phylogeographic characteristics of most Chad Basin lineages, typically the East and West African L-lineages, indicate diversification in situ: the most prevalent (mostly basal) types are shared with other regions, whereas the derived mtDNAs (one or more mutational steps away) are not.

In brief, the history told by the mtDNA seems to indicate that the Chad Basin has a mainly western Central African background; this is indicated by several analyses carried out in this study (shared haplotypes between regions, PCA, etc.), but especially by the phylogeographic patterns observed. The Chad Basin was also the epicentre of a bidirectional ‘genetic’ corridor between West and East Africa, one that favoured the input of West African types into the Chad Basin, probably because of its geographical proximity. The low frequency of the autochthonous North African U6 haplogroup in the Chad Basin populations testifies to the limited influence of North Africa in the region. Some lineages (L3f2* and L3e5) mirror the existence of demographic expansion events in the region, dated to between about 15,856 ± 5,943 and 11,400 Y.B.P. In addition, all the evidence points to the Chad Basin having been set apart from the African scenario of the Atlantic slave trade.

It is interesting that the Arabic-speaking populations fit well (e.g. in the PCA and MDS) within the sub-Saharan variability of the Chad Basin, despite their distinctive phenotypical appearance. On the other hand, other non-sedentary peoples of the Chad Basin, the Fulani nomads, are clearly differentiated from the general Chad Basin mtDNA gene pool, mainly because of high frequencies of the L1b and L3b/d haplogroups.

Genetic exploration of the still unexplored areas of eastern Chad and western Sudan can be expected, in the near future, to reveal other fascinating stories from this remote part of Africa.

Acknowledgments

The authors wish to express their gratitude to the Chad Basin volunteers for their helpful participation in the study. We would also like to thank the two anonymous referees of this article for their most useful comments. This project was supported by the Grant Agency of the Czech Republic (under grant no. 404/03/0318), the Andrew W. Mellon Foundation through the Council of American Overseas Research Centers, Washington, DC, and the Fondation Maison des Sciences de l'Homme in Paris. This work was partially supported by grants from the Ministerio de Sanidad y Consumo (PI030893; SCO/3425/2002), Fundación Investigación Médica Mutua Madrileña Automovilística, and Genoma España (CeGen; Centro Nacional de Genotipado) given to AS.
