Mitochondrial DNA Variation in Mauritania and Mali and their Genetic Relationship to Other Western Africa Populations

Authors


*Corresponding author: A.M. González, Genetics, Biology, University of La Laguna, 38271 Tenerife, Canary Islands, Spain, Tel.: 34-922-31 83-50, Fax: 34-922-31-83-11, E-mail: amglez@ull.es.

Summary

Mitochondrial DNA (mtDNA) variation was analyzed in Mauritania and Mali, and compared to other West African samples covering the considerable geographic, ethnic and linguistic diversity of this region. The Mauritanian mtDNA profile shows that 55% of their lineages have a west Eurasian provenance, with the U6 cluster (17%) being the best represented. Only 6% of the sub-Saharan sequences belong to the L3A haplogroup, a frequency similar to that of other Berber speaking groups but significantly different from that of the Arabic speaking North Africans. The historic Arab slave trade may be the main cause of this difference. Only one HV west Eurasian lineage has been detected in Mali, but 40% of the sub-Saharan sequences belong to cluster L3A. The presence of L0a representatives demonstrates gene flow from eastern regions. Although both groups speak related dialects of the Mande branch, significant genetic differences exist between the Bambara and Malinke groups. The West African genetic variation is well structured by geography and language, but more detailed ethnolinguistic clustering suggests that geography is the main factor responsible for this differentiation.

Introduction

West Africa encompasses considerable geographic, ethnic and linguistic diversity. From north to south three successive bands known as the Sahel, the Sudan and the Guinea move from the desert through the savannah to the moist forest. Past climatic fluctuations modified the transition borders widely, forcing people to migrate and to re-adapt to new places. Very little is known of the Prehistory of West Africa, although in open lands some signs of an early North African Aterian influence have been detected. However, since 12,000 years ago two regional tool-making traditions have been discerned. A non-microlithic culture related to the Pygmies is detectable on the Guinea fringe, and a more elaborate microlithic industry, associated with tall and slender Negroid types, appeared in the Sahel and Sudan bands (Newman, 1995). Agriculture and herding came to the area around the third millennium BC. West Africa may have been the first part of the continent to experience an important population increase from farming (Cavalli-Sforza et al. 1994). Demographic increase and subsequent cultural advances such as those involving iron may have contributed to the rise of the historic West African empires of Ghana, Mali and Songhay (Newman, 1995). Two of the four African language phyla are present in West Africa. Afroasiatic, with its Berber and Chadic branches, covers the Sahara-Sahel, and Niger-Congo, with its Atlantic, Mande, Kwa and Voltaic branches, spreads into the Sudan and Guinea bands. These branches further split into hundreds of dialects spoken by different ethnic groups. From this summarized geographic, archaeological and linguistic information it can be deduced that geographic constraints played a major role in this ethnic and linguistic fragmentation, but that localized cultural improvements allowed better adaptation and, consequently, demographic growth and expansions.
From a population genetics perspective it is crucial to determine if the levels of gene flow that accompanied those cultural advances were sufficient to homogenize the interpopulation genetic variation accumulated by geographic isolation. Genetically, the area has been only moderately studied. Based on classical markers at least two independent demic expansions have been proposed; one in western Senegal and one in the Niger-Mali-Burkina Faso region (Cavalli-Sforza et al. 1994). At the molecular level studies have concentrated on mitochondrial DNA (mtDNA), a locus that shows distinct geographic patterns in Africa (Watson et al. 1997; Salas et al. 2002), but have mainly focused on specific countries such as Senegal (Graven et al. 1995; Rando et al. 1998), Niger-Nigeria (Watson et al. 1997), Guinea-Bissau (Rosa et al. 2004), Cameroon (Coia et al. 2005) or Sierra Leone (Jackson et al. 2005). In the present article we add to the mitochondrial information from this area by analyzing HVSI/II sequences and RFLP typing 64 Maure from Mauritania and 124 samples from different ethnolinguistic groups from Mali. This information, together with the available data from the aforementioned West African countries, has been used to assess the relative importance that physical and cultural factors such as geography and language have had on the genetic structure of the region.

Material and Methods

Samples

Buccal swabs or blood samples were obtained from 188 unrelated West African individuals: 64 Maure from Mauritania, including 30 previously published (Rando et al. 1998) that have been RFLP analyzed and reclassified (Table 1), and 124 subjects from different ethnolinguistic groups in Mali (Table 2). Appropriate informed consent to anonymously use their data was obtained from all the individuals sampled. The following available samples from other West African countries (Figure 1 and Appendix I) have also been used for population comparisons: Senegal 240 (Graven et al. 1995; Rando et al. 1998); Guinea Bissau 372 (Rosa et al. 2004); Sierra Leone 277 (Jackson et al. 2005); Niger-Nigeria 160 (Watson et al. 1997); Cameroon 441 (Destro-Bisol et al. 2004; Coia et al. 2005).

Table 1.  HVI and HVII sequences and RFLPs in Mauritanians
  1. a: HVI nucleotide positions are given minus 16000.

  2. b: Haplotypes only present in Mauritania are marked with an asterisk.

  3. Restriction sites are indicated as follows: a = Alu I, d = Nde II, g = Hinf I, k = Rsa I, l = Taq I, n = Hae II, s = Sph I, u = Mse I

HVI(a) | HVII | RFLPs (71s, 3348d, 4577q, 7025a, 7055a, 9052n, 9070l, 11001n, 12308g, 12810k, 15904u) | Count(b)
CRS (H/HV/U/R)
CRS | | + | 1
CRS | | | 1
CRS | | | 2
519 | 263 311.1 | | 1
H/HV/U/R – CRS
172 | | | 1
207 | | | 1
291 | | | 1
145 222 | | | 1
174 311 | | | 1
234 235 | | + | 1*
278 311 | | | 1
preHV
126 362 | | + | 1
126 304 362 | | + | 1
V
298 | | + + | 1
153 298 | | + + | 1
U5
270 | | | 1
U5b
189 270 | | | 1
189 192 270 | 73 150 152 263 303.1 311.1 | + + | 2
U6a
172 219 278 | | + | 6
092 172 219 278 | | + | 1
172 219 278 290 311 | | + | 1
U6a1
172 189 219 278 | | + | 3
K
224 311 | | | 1
093 224 311 | | | 2
J
069 126 225 | | | 1*
069 126 295 | | | 1*
L3b
124 223 278 362 519 | 73 263 311.1 514dCA | + | 1
111 124 223 278 362 | | | 1*
L3b1
051 223 265C 278 318 362 | | | 1*
L3e4
051 153 223 264 | | | 1*
L1b
126 187 189 223 239 270 278 311 | | | 1*
L1b1
126 187 189 223 264 278 293 311 | | | 1
187 189 223 264 270 278 293 311 | | | 1
126 187 189 223 264 270 278 293 311 | 73 152 182 185T 195 247 263 311.1 354 514dCA 709 769 825A | | 4
086 126 187 189 223 264 270 278 293 311 | | | 3
093 126 187 189 223 264 270 278 293 311 | | | 1
126 145 187 189 223 264 270 278 293 311 | | | 1
L1c*
129 189 223 278 294 311 360 519 | 73 151 152 182 186A 189C 247 263 311.1 316 514dCA 769 825A | + + + | 2*
L2c
223 278 390 519 | 73 150 152 182 195 198 263 303.1 311.1 325 513 514dCA 680 709 769 | | 1
L2a1
223 278 294 309 390 | | | 2
051 223 278 294 309 390 | | | 1
189 223 278 294 309 390 | | | 1
051 223 278 294 305T 309 390 | | | 1*
129 223 278 294 309 342 390 | | | 1*
L2a1beta3
189 192 223 278 294 309 390 | | | 2
L2b1
086 114A 129 213 223 278 362 390 | | | 1*
Total | | | 64
Table 2.  HVI and HVII sequences and RFLPs in Mali
  1. a: HVI nucleotide positions are given minus 16000.

  2. b: Haplotypes only present in Mali are marked with an asterisk.

  3. c: Positions underlined are missing in the haplogroup.

  4. Mal = Malinke, Bob = Bobo, Bab = Bambara, Dog = Dogon, Tua = Tuareg, Son = Sonrhai, Peu = Peul, Sen = Senoufo, Sok = Soninke, ? = unknown ethnic assignation.

HVI(a,c) | HVII | Mal Bob Bab Dog Tua Son Peu Sen Sok ? Mali(b)

HV (+7025AluI, −14766MseI)
15898 222 311 | 263 303.1 311.1 509 750 | 1 1*
L3b (15940dT)
15940dT 124 223 278 362 | 73 249iA 263 311.1 514dCA 750 | 1 1
15940dT 124 223 278 362 519 | 73 263 311.1 514dCA 750 | 1 1
15940dT 124 223 278 362 519 | 73 151 152 263 311.1 514dCA 544G 750 | 1 1
15940dT 124 278 355 362 527 | 73 263 311.1 514dCA 750 | 1 1*
15940dT 093 124 223 278 362 519 | 73 263 303.1 311.1 514dCA 750 | 1 1
15940dT 093 124 223 278 362 527 | 73 263 311.1 514dCA 750 | 1 1
15940dT 051 124 183C 189 223 278 362 | 73 152 263 311.1 514dCA 750 | 1 1
15940dT 124 209 223 278 362 384 | 73 263 303.1 311.1 514dCA 750 | 1 1*
15940dT 124 223 278 291 362 519 | 73 195 263 311.1 514dCA 750 | 1 1
15916 15940dT 124 223 278 355 362 527 | 73 263 311.1 514dCA 750 | 1 1
15940dT 124 145 182C 183C 189 223 278 362 | 73 263 311.1 514dCA 750 | 1 1
15940dT 124 156 176 223 278 362 519 | 73 263 311.1 514dCA 750 | 1 1*
15940dT 093 124 182C 183C 189 223 270 278 294 362 | 73 263 303.1 311.1 514dCA 750 | 1 1*
L3b1
15940dT 223 278 362 | 73 263 303.1 311.1 514dCA 750 | 1 1
15940dT 223 278 362 519 | 35 73 263 311.1 514dCA 750 | 1 1
15940dT 223 278 362 519 | 73 263 311.1 514dCA 750 | 1 1
15940dT 093 223 278 362 519 | 73 263 311.1 514dCA 750 | 1 1
15940dT 223 234 278 362 519 | 73 263 311.1 514dCA 750 | 1 1 2*
15940dT 051 223 278 318 362 519 | 73 261 263 311.1 514dCA 750 | 1 1
15940dT 093 223 234 278 362 519 | 73 263 311.1 514dCA 750 | 1 1*
L3d (−8616MboI, −10084TaqI, 921)
124 223 | 73 152 263 311.1 514dCA 709 750 921 | 1 1
124 223 399 | 73 150 263 311.1 514dCA 750 851 921 | 1 1
124 166 223 | 73 152 263 311.1 514dCA 750 921 | 1 1
124 183C 189 223 399 | 73 150 152 263 311.1 514dCA 750 921 | 1 1
124 223 286A 519 | 73 150 152 263 303.1 311.1 514dCA 750 921 | 1 1*
124 223 325 399 | 73 150 152 263 303.1 311.1 514dCA 750 921 | 1 1*
L3d2
124 223 256 | 73 146 263 303.1 311.1 514dCA 574 750 921 | 1 1
124 223 256 | 73 146 152 263 311.1 514dCA 750 921 | 1 1
124 223 256 355 | 73 146 152 263 311.1 514dCA 750 921 | 1 1*
L3d3
124 183C 189 223 278 304 311 | 73 151 152 263 303.1 311.1 750 921 | 1 1
L3e1 (15942, 189, 200)
15942 172 223 327 399 | 73 150 189 200 263 311.1 750 | 1 1
L3e2 (+2349MboI)
15927C 223 320 519 | 73 150 195 263 303.1 311.1 750 | 1 1 2
15924 223 320 519 | 73 150 189 195 198 263 311.1 750 | 1 1
148 223 320 519 | 73 150 195 198 263 311.1 750 | 1 1*
184iC 189 192 278 320 | 73 150 263 311.1 750 | 1 1*
L3e2b (+2349MboI)
172 189 223 320 519 | 60 61 64 73 150 195 263 311.1 514dCA 620dT 750 | 1 1
172 183C 189 223 320 519 | 73 150 152 195 263 311.1 750 | 1 1
172 183C 189 223 320 519 | 73 150 152 195 263 311.1 514dCA 750 | 1 1
172 183C 189 223 320 519 | 73 150 195 263 303.1 311.1 750 | 1 1
172 183C 189 223 320 519 | 73 150 195 263 311.1 514dCA 750 | 1 1
172 183C 184iC 187 189 223 320 519 | 73 114 150 195 263 303.1 311.1 750 | 1 1
172 183C 189 223 259 320 519 | 73 150 195 263 303.1 311.1 750 | 1 1*
L3e3 (+2349MboI, +5260AvaII, 195, 750)
214 223 265T 311 | 73 150 195 263 303.1 311.1 514dCA | 1 1*
L3e4 (+2349MboI, +5260AvaII, 750)
051 223 264 519 | 73 150 263 311.1 514dCA | 1 1
L3e5 (+2349MboI, 398)
041 223 355 519 | 73 150 263 311.1 398 514dCA 750 | 1 1
L3f (15940dT, 189, 200)
15940dT 129 209 223 311 519 | 73 189 200 263 303.1 311.1 750 | 1 1
L3f1
15940dT 209 223 292 311 519 | 73 150 189 200 263 750 761 | 1 1
L0a* (73, 93, 189, 236)
129 148 168 172 187 188G 189 223 230 293 311 320 519 | 64 93 152 185 189 236 247 263 311.1 514dCA 574 750 769 825A | 1 1
L0a1
129 148 168 172 187 188G 189 223 230 311 320 | 64 93 185 189 195 200 236 247 263 311.1 514dCA 750 769 825A | 1 1
129 148 168 172 187 188G 189 223 230 241 311 320 519 | 64 93 152 185 189 200 236 247 263 303.1 311.1 514dCA 750 769 825A | 1 1*
L1b (−7055AluI, 185T, 357, 709, 710)
126 187 189 223 264 270 278 311 519 | 73 150 152 182 185T 189 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 311 519 | 73 152 182 185T 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 311 519 | 73 152 182 185T 189 195 247 263 303.1 311.1 357 514dCA 709 710 750 769 825A | 1 1
111 126 187 189 223 239 270 278 311 | 73 146 152 263 303.1 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 239 264 270 278 311 519 | 73 152 182 185T 189 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
L1b1 (−7055AluI, 185T, 357, 709, 710)
126 187 189 223 264 270 278 293 311 | 73 151 152 182 185T 189 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 399 | 73 151 152 182 185T 189 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 519 | 73 94 152 182 185T 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 519 | 73 146 152 182 185T 189 195 247 263 303.1 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 519 | 73 152 182 185T 195 247 263 286dA 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 519 | 73 152 182 185T 247 263 303.1 311.1 357 514dCA 709 710 723 750 769 825A | 1 1
093 126 187 189 223 264 270 278 293 311 519 | 73 152 182 185T 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1
114G 126 187 189 223 264 270 278 293 311 519 | 73 152 182 185T 195 247 263 303.1 311.1 357 514dCA 709 710 750 769 825A | 1 1
126 187 189 223 264 270 278 293 311 318 519 | 73 152 182 185T 195 247 263 311.1 357 514dCA 709 710 750 769 825A | 1 1*
L1c1a (151, 186A, 189C, 297, 316)
15941 093 129 187 189 223 261 274 278 311 360 519 | 73 151 152 182 186A 189C 247 263 291T 297 311.1 316 514dCA 750 769 825A | 1 1*
L1c3a1 (15905, 15978, 151, 186A, 189C, 316)
15905 15978 129 183C 189 215 223 278 294 311 360 519 | 73 151 152 182 186A 189C 247 263 311.1 316 514dCA 516 750 769 825A | 1 1
15905 15978 093 129 183C 189 215 223 278 294 311 360 519 | 73 151 152 182 186A 189C 247 263 297T 311.1 316 514dCA 750 769 772 825A | 1 1
L1c3b1 (15905, 15978, 186A, 189C, 316, 629)
15905 15978 017 129 163 187 189 223 278 293 294 311 360 519 | 73 152 182 186A 189C 247 263 311.1 316 514dCA 629 750 769 825A | 1 1
15905 15978 15998 017 129 163 187 189 223 278 293 294 311 360 519 | 73 152 182 186A 189C 247 263 311.1 316 514dCA 629 750 769 825A | 1 1
L2a
223 278 294 390 519 | 73 146 152 195 263 311.1 750 769 | 1 1
L2a1
278 294 309 390 519 | 73 146 152 195 263 311.1 750 769 | 1 1*
223 278 294 309 390 | 73 146 152 195 263 303.1 311.1 514dCA 750 769 | 1 1
223 278 294 309 390 | 73 143 146 152 195 198 263 311.1 670 750 769 | 1 1
223 278 294 309 390 | 73 146 152 195 200 263 311.1 514dCA 670 750 769 | 1 1
223 278 294 309 390 519 | 73 146 152 195 263 311.1 750 769 | 1 1
223 278 294 309 390 519 | 73 146 152 195 263 303.1 311.1 750 769 | 1 1
223 278 294 309 390 519 | 73 146 152 195 263 311.1 513 750 769 | 1 1
223 278 294 309 390 519 | 73 146 152 195 248dA 263 303.1 311.1 750 769 | 1 1
042 223 278 294 309 390 519 | 73 146 152 195 263 311.1 670 750 769 | 1 1
223 278 294 309 368 390 519 | 73 146 152 195 263 311.1 509 514iiCA 750 769 | 1 1
223 278 294 309 390 427 519 | 73 146 152 195 263 303.1 311.1 506 750 769 | 1 1
129 223 278 294 309 390 519 | 73 146 152 195 263 311.1 670 750 769 | 1 1
183C 189 223 278 294 309 390 | 73 143 146 152 195 263 311.1 514dCA 750 769 | 1 1
189 223 278 294 309 390 519 | 73 146 152 195 263 303.1 311.1 750 769 | 1 1
189 223 278 294 309 390 519 | 73 146 152 195 263 311.1 750 769 | 1 1
223 278 292 294 309 368 390 519 | 73 146 150 152 182 195 263 311.1 514iiCA 750 769 | 1 1
086 183C 189 223 269 278 294 309 390 | 73 146 150 152 195 263 311.1 731 750 769 | 1 1*
15993 189 223 270 278 292 294 309 356 390 | 73 143 146 152 195 263 303.1 311.1 534 750 | 1 1*
L2a1a
223 278 286 294 309 390 519 | 73 146 152 195 263 311.1 750 769 | 1 1
223 278 286 294 309 390 519 | 73 146 152 195 263 311.1 514dCA 750 769 | 1 1
L2a1beta3
189 192 223 278 294 309 390 | 73 143 146 152 195 263 303.1 311.1 750 769 | 1 1
189 192 223 278 294 309 390 | 73 143 146 152 195 263 311.1 534 750 769 | 1 1
189 192 223 278 294 309 390 519 | 73 146 263 311.1 534 750 769 | 1 1
189 190iC 192 223 278 293 294 309 390 | 73 146 152 195 263 311.1 514dCA 750 769 | 1 1*
L2a1d
15924 213 223 278 291 294 309 390 519 | 73 143 146 152 195 263 303dC 311.1 750 769 | 1 1*
L2b (150, 182, 198, 204)
093 114A 129 213 223 278 390 | 73 146 150 152 182 195 198 204 263 311.1 709 750 769 867 | 1 1
114A 129 212 213 223 278 390 | 73 146 150 152 182 195 198 204 263 311.1 514dCA 709 750 769 | 1 1*
L2b1 (418)
114A 129 213 223 278 362 390 | 73 146 150 152 182 195 198 204 263 311.1 418 514dCA 750 769 | 1 1
114A 129 213 223 278 362 390 | 73 146 150 152 182 195 198 204 263 311.1 418 514dCA 750 769 | 1 1
114A 129 213 223 278 355 362 390 | 73 150 152 182 195 198 204 263 311.1 418 514dCA 750 769 | 1 1
L2b2 (15940iT, 207)
15897 15940iT 15947 114A 129 213 223 278 355 390 519 | 73 146 150 152 182 185C 195 198 204A 207 263 311.1 514dCA 750 769 | 1 1*
L2c (93, 325, 680, 709)
223 278 | 73 93 146 150 152 182 198 263 311.1 325 514dCA 680 709 750 769 | 1 1
223 278 390 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1
223 278 390 | 73 146 150 152 182 195 199 263 311.1 325 680 709 750 769 | 1 1
223 278 390 | 73 93 146 150 152 195 198 263 303.1 311.1 325 513 514dCA 680 709 750 769 | 1 1
093 192 278 390 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1*
051 223 261 278 519 | 73 93 146 150 152 182 195 198 263 303.1 311.1 325 514dCA 680 709 750 769 | 1 1
223 278 293 390 519 | 73 93 150 152 182 195 198 263 311.1 325 499 514dCA 680 709 750 769 | 1 1*
223 278 311 390 | 73 93 146 150 152 182 195 198 263 292 311.1 325 514dCA 680 709 750 769 | 1 1
223 278 355 390 519 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1*
126 223 274 278 390 | 73 89 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 798 886 | 1 1*
15947 172 223 278 362 390 | 73 93 146 150 152 182 195 198 263 311.1 325 680 709 750 769 | 1 1*
177 223 241 278 390 519 | 73 146 150 152 263 303.1 311.1 325 514dCA 680 709 750 769 | 1 1*
15883 093 189 223 278 355 390 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1*
164 223 256 278 294 390 | 73 93 146 150 152 182 195 263 311.1 325 513 514dCA 680 709 750 769 | 1 1*
L2c1
189 223 278 318 390 | 73 89 93 146 150 152 182 195 198 263 303.1 311.1 325 514dCA 680 709 750 769 | 1 1*
L2c2
192 223 264 278 390 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1*
093 126 223 264 274 278 390 | 73 93 146 150 182 195 263 311.1 325 508 514dCA 680 709 750 769 | 1 1
093 126 223 264 274 278 390 | 73 93 146 150 152 182 195 263 311.1 325 514dCA 680 709 750 769 | 1 1
126 188 209 223 264 278 390 | 73 93 146 150 152 182 195 198 263 311.1 325 514dCA 680 709 750 769 | 1 1*
L2d1 (456, 870)
129 183C 189 274 278 300 354 390 399 519 | 73 146 150 195 263 303.1 311.1 456 750 769 870 | 1 1*
Total | | 31 5 52 6 16 1 5 1 3 4 124
Figure 1.

Map of West Africa showing the origin of the ethnolinguistic samples analyzed in the present study. Numbers indicate the region from which the analyzed samples derive: 1. Mau from Mauritania; 2. Wol, Ser, Bab and Mal from Senegal; 3. ESS, Bal, Dio and Mae from Senegal and Guinea Bissau; 4. Nal, Pap, Bij and Ful from Guinea Bissau; 5. Peu from Senegal, Guinea Bissau and Mali; 6. Lim, Tem, Men and Lok from Sierra Leone; 7. Bab and Mal from Mali; 8. Yor from Nigeria; 9. Tua from Mali and Niger; 10. Hau from Niger; 11. Ful from Niger-Nigeria; 12. Ful, Maa, Pod, Uld, Dab, Fal, Tup and Tal from North Cameroon; 13. Bas, Bam, Bak and Ewo from South Cameroon. Population codes are as defined in Table 5 and Appendix I.

Sequencing of mtDNA

For the Maure, total DNA from either buccal swabs or blood samples was isolated following a fast alkaline method (Rudbeck & Dissing, 1998). For Mali, blood samples were obtained from random blood donors who identified their ethnicity, and DNA was extracted using the Gentra PureGene kit (Minneapolis, MN). Hypervariable segment I (HVSI) and hypervariable segment II (HVSII) of the mitochondrial control region were amplified by PCR using primer pairs HV1 (L15840: 5′ ACTTCACAACAATCCTAATCCT 3′)/HV2 (H16436: 5′ CggAgCgAggAgAgTAgCAC 3′) and L16340/H945 (Maca-Meyer et al. 2001), respectively. Amplified products were sequenced for both complementary strands with the Big Dye Terminator Cycle sequencing kit (Applied Biosystems). Sequencing reactions were analyzed on an Applied Biosystems 3100 DNA analyzer.

Haplotype Classification and RFLP Typing

Sequences were aligned with the Cambridge Reference Sequence (CRS) using CLUSTAL, and mutations were identified by the last three digits of their positions in the reference sequence (Anderson et al. 1981). For transversions the variant base is specified by an additional letter. The haplotypes obtained were sorted into mtDNA haplogroups following, with minor modifications, the nomenclature proposed in Salas et al. (2002) and updated in Kivisild et al. (2004). In cases of ambiguous sequence assignment additional RFLP analyses were carried out, as detailed in Tables 1 and 2. To homogenize the data, the other published sequences used in the genetic analysis were reclassified into haplogroups using the same criteria.
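The position shorthand used in this section and in Tables 1 and 2 can be illustrated with a short sketch. The function name and the token grammar below are assumptions inferred from the table entries (they are not specified in the text): a three-digit HVSI entry is expanded by restoring the omitted 16000, a five-digit entry is taken as a full coordinate, and a change is labelled a transversion when a variant base is written.

```python
import re

# Assumed token grammar (inferred from the table entries, not stated in the text):
# 3 to 5 digits, an optional 'd' (deletion) or 'i' (insertion) marker, optional bases.
TOKEN = re.compile(r"^(\d{3,5})([di]?)([ACGT]*)$")


def expand_hvsi_token(token):
    """Return (full_position, kind, bases) for one HVSI entry such as '223' or '265C'."""
    match = TOKEN.match(token)
    if match is None:
        raise ValueError(f"unrecognized token: {token!r}")
    digits, marker, bases = match.groups()
    position = int(digits)
    # Three-digit entries are the last three digits of a 16xxx coordinate;
    # longer entries (e.g. 15940) are treated as full coordinates already.
    if len(digits) == 3:
        position += 16000
    if marker == "d":
        kind = "deletion"
    elif marker == "i":
        kind = "insertion"
    elif bases:
        kind = "transversion"  # a variant base is written only for transversions
    else:
        kind = "transition"
    return position, kind, bases


if __name__ == "__main__":
    for tok in ["223", "093", "265C", "15940dT"]:
        print(tok, "->", expand_hvsi_token(tok))
```

HVSII entries such as 303.1 or 514dCA follow a different convention and are deliberately not handled by this sketch.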

Genetic Analysis

For genetic comparisons with other published data only HVSI positions from 069 to 365 were used. Length variations were not taken into account except for the rare 323d that was used to classify the L3e1b subgroup. Where necessary, samples from different publications were pooled by linguistic-ethnic affinities following the published classification in Ethnologue (http://www.ethnologue.com/family_index.asp). However, those ethnolinguistic groups with sample sizes of fewer than 20 individuals were omitted from the analyses. Genetic variation was apportioned by geographic, macrolinguistic and ethnolinguistic criteria using AMOVA by means of ARLEQUIN2000. Genetic distances between populations, based on pairwise FST, were also calculated as implemented in ARLEQUIN2000. Principal component (PC) analyses were performed using SPSS. A database of published African HVSI and HVSI/HVSII sequences and available RFLP coding information was compiled to be used for mismatch analyses.
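The two filtering rules stated above (the 069 to 365 window without length variants, and the minimum sample size of 20) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the token grammar and the example inputs are hypothetical, the haplotype is one L3b motif taken from Table 2, and the sample sizes are Mali column totals as read from Table 2 before any pooling with other publications.

```python
import re

# Assumed token grammar, inferred from the tables: digits, optional d/i marker, optional bases.
TOKEN = re.compile(r"^(\d{3,5})([di]?)([ACGT]*)$")


def comparison_motif(tokens, first=16069, last=16365, keep_length_variants=("323d",)):
    """Keep substitutions inside first..last; drop length variants except the listed ones."""
    kept = []
    for token in tokens:
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        digits, marker, _bases = match.groups()
        # Three-digit entries omit the leading 16000.
        position = int(digits) + (16000 if len(digits) == 3 else 0)
        if not first <= position <= last:
            continue
        if marker and token not in keep_length_variants:
            continue  # insertion or deletion that is not an allowed exception
        kept.append(token)
    return kept


def large_enough(sample_sizes, minimum=20):
    """Return only the groups whose sample size reaches the minimum."""
    return {name: n for name, n in sample_sizes.items() if n >= minimum}


if __name__ == "__main__":
    l3b = "15940dT 051 124 183C 189 223 278 362".split()
    print(comparison_motif(l3b))
    print(large_enough({"Bab": 52, "Mal": 31, "Tua": 16, "Dog": 6}))
```

With these inputs only the Bambara and Malinke pass the size filter, which agrees with the statement in the Results that these were the only Mali groups large enough to be studied independently.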

Results

Minor Additions and Modifications to the African Haplogroup Nomenclature

Based on published (Ingman et al. 2000; Maca-Meyer et al. 2001; Torroni et al. 2001; Herrnstadt et al. 2002; Mishmar et al. 2003) and unpublished African complete mtDNA sequences we have added several HVSII motifs as diagnostic markers for haplogroup assignment of haplotypes obtained by sequencing the HVSI and HVSII regions (Table 2). In addition we have renamed L1c3 as L1c3a1, and defined a new subcluster L1c3b1 characterized by additional transitions (017, 163) in the HVI region and 629 in the HVII region. For phylogenetic considerations we have also renamed the L2a1c cluster (Kivisild et al. 2004) as L2a2, keeping the Salas et al. (2002) nomenclature for the L2a1 branches, and named as L2a3 the L2a haplotypes with the 16291 mutation. Two new L2a1 subhaplogroups, respectively defined by the 213 and 234 HVSI positions, have been identified as L2a1d and L2a1e. L2b2 was the name given to the haplotypes belonging to L2b without 16362 and 418, but with the 15940iT and 207 mutations. We have renamed as L3e5 the L3 clade defined by mutation 041 in the HVI region (Fadhlaoui-Zid et al. 2004), as it has the L3e diagnostic +2349 MboI RFLP. Finally, we keep the Salas et al. (2002) nomenclature for the L1a and L1a1 branches, now named L0a and L0a1 following Mishmar et al. (2003). L0a1a has been defined by the mutated positions 129 148 168 172 187 188G 189 223 230 278 293 311 320 in the HVI region. The complete topologies for the L1c and L2 haplogroups, including the new subgroups, are shown in Figures 2a and 2b, respectively.

Figure 2.

The skeleton of major branching nodes in the L1c and L2 haplogroups, based on HVI sequences. Underlined positions belong to HVII or coding regions.

Classification of Mauritanian Sequences

A total of 42 different haplotypes was observed in 64 Maure. A mismatch search detected 11 haplotypes (26%) which, until now, have been found only in Maure. About 55% of the sequences had a Euroasiatic provenance. The specific North African cluster U6 (Rando et al. 1998; Maca-Meyer et al. 2003) represents 17% of the total, being one of the most prevalent in the area. The rest of the lineages (45%) belonged to sub-Saharan Africa haplogroups. Eight of them (42%) are unique, six (32%) widely spread in Africa, and two (12%) confined to West Africa. The high frequency of L1b/1 lineages (19%) in Maure, at the upper end of the range found in other western sub-Saharan African countries, is noteworthy. The L2a1 clade is also well represented (9%), but L3 African lineages only account for 6% of the total (Table 1).

Classification of Mali Sequences

A total of 84 different haplotypes were observed in 124 Mali subjects. Only the Bambara and Malinke had large enough sample sizes to be studied independently (Table 2). A mismatch search detected 36 haplotypes (43%) found only in Mali, of which 53% belong to the L2 cluster. In contrast to Mauritania, while only eight lineages (17%) are widely spread in Africa, fourteen (30%) are restricted to West Africa. Again in contrast to the Maure, only one Euroasiatic sequence, found in a Bambara (Table 2), was detected in Mali. Although it shares the HVI 222 transition with other North African sequences belonging to haplogroup H (Rando et al. 1998), the RFLP analysis placed it in the basal HV cluster. Surprisingly for a West African country, around 42% of the sub-Saharan African sequences were L3 lineages. The predominant haplogroups belonged to L3b (17%), L3e (13%) and L3d (8%), but with different distributions within ethnolinguistic groups. Whereas the Bambara have higher frequencies of L3b (21%) and L3e (10%), L3d (10%) is similar in both samples. One lineage of the L3e5 subgroup, considered to be restricted to North Africans (Fadhlaoui-Zid et al. 2004), was detected in a Peul from Mali (Table 2). Congruent with its geographic position, L2 lineages accounted for 40% of the total in Mali, with the subclades L2a1/a (16%) and L2c (11%) being the best represented. These subclades have higher frequencies in the Malinke (22% and 9%) than in the Bambara (17% and 2%). Representatives of the eastern Africa clades L0a and L0a1 were detected in Mali once and twice, respectively, but the western Africa L1b/1 subclades were the best represented (11%).

West Africa Genetic Differentiation

Several statistics were used to compare the haplogroup frequencies observed in Mauritania and Mali to those available from other West African countries. Initially we grouped samples by geography (Appendix II) and performed an AMOVA that assigned only 1.2% of the total variance to differences among countries. However, this differentiation was highly significant (P < 0.001). FST based pair-wise comparisons (Table 3) showed that all countries are significantly different from each other. Their relative relationships are accurately reflected by the first two principal components of a PC analysis (Figure 3). The first principal component (PC1), which accounts for 35% of the variance, splits Cameroon from the rest of the countries, with Niger-Nigeria the closest to it. The L0 and L1c and the bulk of the L2a and L3 lineages are the main reasons for this split. The second principal component (PC2), which extracted 25% of the variance, clearly separates Mauritania from the rest of the western countries, due to its high frequency of Euroasiatic lineages, leaving Mali closer to Senegal-Guinea-Sierra Leone than to Niger-Nigeria. These results demonstrate that the genetic variation is geographically structured. The Mauritania-Senegal and Mali border seems to be an important barrier to southwards gene flow of the North African Euroasiatic haplogroups to sub-Saharan Africa. However, the large genetic differentiation between Mauritania and Mali, both sharing an extensive border in the Sahara area, needs explanation. On the one hand, this border largely occupies a desert area, so gene flow in this region is infrequent. On the other, the fact that the Mali samples were collected in Bamako (southern Mali) could have biased the data towards the southern regions, decreasing the frequencies of the Euroasiatic haplogroups.
The latter explanation is reinforced by the comparatively high presence of Euroasiatic lineages in the Niger-Nigeria sample, where the northern nomadic Tuareg and the pastoral Hausa are well represented (Watson et al. 1997). In fact, this Euroasiatic component makes Niger-Nigeria slightly more similar to Mauritania.
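The country relationships described above can be checked directly against the pairwise FST values reported in Table 3. The sketch below simply stores the lower triangle of that table and ranks, for a given country, the others by increasing FST; the function names are hypothetical and no significance levels are used.

```python
# Pairwise FST values copied from the lower triangle of Table 3.
FST = {
    ("Sen", "Mau"): 0.0297,
    ("GuB", "Mau"): 0.0334, ("GuB", "Sen"): 0.0023,
    ("Mal", "Mau"): 0.0336, ("Mal", "Sen"): 0.0062, ("Mal", "GuB"): 0.0053,
    ("SiL", "Mau"): 0.0345, ("SiL", "Sen"): 0.0079, ("SiL", "GuB"): 0.0060,
    ("SiL", "Mal"): 0.0072,
    ("NiN", "Mau"): 0.0210, ("NiN", "Sen"): 0.0139, ("NiN", "GuB"): 0.0128,
    ("NiN", "Mal"): 0.0074, ("NiN", "SiL"): 0.0048,
    ("Cam", "Mau"): 0.0313, ("Cam", "Sen"): 0.0207, ("Cam", "GuB"): 0.0141,
    ("Cam", "Mal"): 0.0115, ("Cam", "SiL"): 0.0098, ("Cam", "NiN"): 0.0047,
}


def fst(a, b):
    """Symmetric lookup of the pairwise FST between two country codes."""
    if a == b:
        return 0.0
    return FST[(a, b)] if (a, b) in FST else FST[(b, a)]


def ranked_neighbours(country):
    """Other countries ordered from the smallest to the largest FST."""
    others = {code for pair in FST for code in pair} - {country}
    return sorted(others, key=lambda other: fst(country, other))


if __name__ == "__main__":
    print("Mal:", ranked_neighbours("Mal"))
    print("Mau:", ranked_neighbours("Mau"))
```

For Mali the three smallest values are those with Guinea Bissau, Senegal and Sierra Leone, ahead of Niger-Nigeria, and for Mauritania the smallest value is the one with Niger-Nigeria, in line with the description in the text.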

Table 3.  Country pairwise FSTs
  1. *= p < 0.05 **= p < 0.01 ***= p < 0.001.

  2. Country abbreviations as in Figure 3.

 | Mau | Sen | GuB | Mal | SiL | NiN
Sen | 0.0297***
GuB | 0.0334*** | 0.0023*
Mal | 0.0336*** | 0.0062* | 0.0053*
SiL | 0.0345*** | 0.0079*** | 0.0060*** | 0.0072**
NiN | 0.0210*** | 0.0139*** | 0.0128*** | 0.0074** | 0.0048*
Cam | 0.0313*** | 0.0207*** | 0.0141*** | 0.0115*** | 0.0098*** | 0.0047**
Figure 3.

PC analysis based on haplogroup frequencies of West African countries. Mau = Mauritania, Sen = Senegal, GuB = Guinea Bissau, Mal = Mali, SiL = Sierra Leone, NiN = Niger-Nigeria, Cam = Cameroon.

In a second approach the samples, excluding the Afroasiatic Maure because of their particular haplogroup composition, were assigned to one of two African macro-linguistic phyla: Afroasiatic or Niger-Congo. Within the latter, sample sizes allowed for further subdivisions into Atlantic, Mande and Volta-Congo branches. Moreover, excepting the Bijagó from Guinea, and the Limba and Temne from Sierra Leone, all the Atlantic samples belong to the Northern subdivision (Appendix III). The AMOVA shows that the genetic diversity is also linguistically structured (0.8% of the total variance is due to genetic diversity among linguistic divisions, P < 0.001). FST based pair-wise comparisons (Table 4) show that: 1) there is no significant difference between the Atlantic branches; 2) Mande is closest to Atlantic; 3) Volta-Congo and Afroasiatic are different from each other and from the Atlantic-Mande cluster. The PC analysis was highly congruent with the FST results (Figure 4). Whereas PC1, accounting for 45.7% of the variance, splits off the Volta-Congo group, PC2, which accounts for 29.3% of the variance, isolates the Afroasiatic group from the Atlantic-Mande cluster. L3e lineages, excepting L3e4, the bulk of L1c (less L1c*) and L0a (less L0a1), and the exclusive presence of representatives of the eastern Africa L4g and L5 haplogroups (Kivisild et al. 2004), are mainly responsible for the Volta-Congo split. On the other hand, the relative abundance of L3 lineages and the higher frequencies of L2a3, L1c3a1 and L0a1a largely pull Afroasiatics away from the Atlantic-Mande group. Although there is apparently a strong genetic differentiation among linguistic divisions, it is difficult to distinguish between a geographic and a linguistic component.

Table 4.  Linguistic groups pairwise FSTs
  1. *= p < 0.05 **= p < 0.01 ***= p < 0.001.

  2. AtN = Atlantic Northern, Atl = Atlantic, Man = Mande, VoC = Volta-Congo, AfA = Afro-Asiatic.

  3. Ethnolinguistic groups included in each linguistic family are indicated in Appendix I.

 | AtN | Atl | Man | VoC
Atl | 0.0017
Man | 0.0026** | 0.0019
VoC | 0.0112*** | 0.0092*** | 0.0185***
AfA | 0.0102*** | 0.0085*** | 0.0184*** | 0.0048**
Figure 4.

PC analysis based on haplogroup frequencies of West African macrolinguistic groups. Abbreviations as in Table 4.

Ethnolinguistic Genetic Differentiation

To test the importance of geographic versus linguistic components in the genetic differentiation we compared samples grouped by ethnolinguistic assignation without taking into consideration political boundaries, with the only restriction that their total sample sizes were not less than 20 individuals (Appendix IV). AMOVA analysis assigned 1.9% (P < 0.001) of the genetic variance to differences among samples. In this case some discrepancies between the P-values assigned to the FST based pair-wise comparisons (Table 5) and the relative distances plotted in the PC graphic (Figure 5) were evident. Small sample sizes, global versus binary relationships and the fact that, in this case, the two first PCs only captured 21,2% of the total variance could be, separately or together, responsible for these heterogeneous results. In spite of these inaccuracies the picture reflected by the PC analysis is very interesting. Bakaka, Ewondo, Bamileke and Bassa are well separated from the rest by PC1, the relatively high frequency of L1 and L0 and the low frequency of L2b and L2c lineages being mainly responsible for this split. Geographically, these four populations belong to South Cameroon and have a Volta-Congo linguistic affiliation. However, not all Volta-Congo speaking samples cluster together. In fact, the northern Cameroon Volta-Congo Fali, Tali and Tupuri form a different cluster with the Afroasiatic speaking Ouldeme, Podokwo, Daba and Mandara, all of whom inhabit northern Cameroon. In turn two Afroasiatic speakers, the Tuareg and Hausa, and the Volta-Congo speaking Yoruba occupy a central position nearer to the Atlantic-Mande speakers than to their linguistic counterparts. It is worth mentioning that these three samples are from Niger-Nigeria, and therefore geographically closer to the western Atlantic-Mande area than to the Afroasiatic or Volta-Congo from Cameroon. 
The samples from the Atlantic and Mande speaking groups form a rather homogeneous cluster; nevertheless, some differences are still detectable. First, the Bambara from Mali do not cluster with the other Mande speakers, including the Malinke, who are also from Mali. The high frequency of L3b and the presence of L0a lineages pull the Bambara towards the eastern Hausa and Bassa. In addition, the Bambara's very low frequencies of L2b/1 and L2c1/2 pull them away from the other Mande and Atlantic speaking groups. Second, the four samples from Sierra Leone apparently cluster by linguistic affinity, as the Mende and Loko, of Mande affiliation, and the Temne and Limba, of Atlantic affiliation, form different pairs. However, geographic affinities cannot be discarded, as the first pair live in northern areas of Sierra Leone and the second in the south. Third, samples from Senegal and Guinea Bissau also seem to be genetically related by geography. Finally, we performed an AMOVA taking into account both macrolinguistic divisions (groups) and the ethnolinguistic samples within them (populations). Only 0.54% of the total variance was assigned to differences among groups, and 1.42% to differences among populations within groups. Taken together, most of these results point to a genetic differentiation in West Africa driven more by geographic distance than by linguistic affiliation.
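
As a consistency check on the hierarchical AMOVA described above, the reported variance percentages can be converted into the usual fixation indices. The Python sketch below assumes the standard AMOVA definitions of the Phi-statistics as ratios of variance components; the function name is illustrative, and the within-population share is taken as the remainder of the two reported percentages, a value the text does not state.

```python
def amova_phi_statistics(among_groups_pct, among_pops_pct):
    """Convert hierarchical AMOVA variance percentages into Phi-statistics.

    among_groups_pct: % of total variance among groups
    among_pops_pct:   % of total variance among populations within groups
    The within-population share is assumed to be the remainder
    (it is not reported in the text).
    """
    within_pops_pct = 100.0 - among_groups_pct - among_pops_pct
    # Phi_CT: groups relative to the total variance
    phi_ct = among_groups_pct / 100.0
    # Phi_SC: populations relative to the variance inside groups
    phi_sc = among_pops_pct / (among_pops_pct + within_pops_pct)
    # Phi_ST: populations (ignoring groups) relative to the total variance
    phi_st = (among_groups_pct + among_pops_pct) / 100.0
    return {
        "within_pops_pct": within_pops_pct,
        "phi_ct": phi_ct,
        "phi_sc": phi_sc,
        "phi_st": phi_st,
    }


# Percentages reported for macrolinguistic groups and ethnolinguistic samples
result = amova_phi_statistics(0.54, 1.42)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the implied Phi_ST is 0.0196, which agrees with the 1.9% of variance assigned to differences among samples in the single-level analysis, while the among-group component alone accounts for roughly a quarter of that differentiation.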

Table 5.  Ethnolinguistic groups pairwise FSTs

Group | ESS | Peu | Ful | Wol | Ser | Nal | Bal | Pap | Dio | Bij | Lim | Tem | Mae | Bab | Mal | Men
Peu | 0.0054
Ful | 0.0154* | 0.0000
Wol | 0.0192* | 0.0179* | 0.0090*
Ser | 0.0173 | 0.0120 | 0.0137 | 0.0000
Nal | 0.0221* | 0.0265* | 0.0237** | 0.0232* | 0.0248
Bal | 0.0118 | 0.0241** | 0.0160** | 0.0062 | 0.0130 | 0.0086
Pap | 0.0093 | 0.0067 | 0.0160*** | 0.0146* | 0.0161 | 0.0000 | 0.0126*
Dio | 0.0181 | 0.0118 | 0.0165* | 0.0181* | 0.0281* | 0.0065 | 0.0241** | 0.0075
Bij | 0.0006 | 0.0000 | 0.0047 | 0.0132 | 0.0031 | 0.0115 | 0.0012 | 0.0081 | 0.0131
Lim | 0.0000 | 0.0107 | 0.0178*** | 0.0175** | 0.0236* | 0.0107 | 0.0118* | 0.0044 | 0.0144* | 0.0196*
Tem | 0.0136 | 0.0130* | 0.0077** | 0.0173** | 0.0298** | 0.0051 | 0.0093* | 0.0077* | 0.0158* | 0.0118 | 0.0097*
Mae | 0.0275** | 0.0107 | 0.0175*** | 0.0230** | 0.0325** | 0.0172* | 0.0197** | 0.0083* | 0.0245** | 0.0045 | 0.0243*** | 0.0174***
Bab | 0.0042 | 0.0064 | 0.0100* | 0.0131* | 0.0226* | 0.0210* | 0.0117* | 0.0093 | 0.0157* | 0.0142 | 0.0000 | 0.0121* | 0.0293***
Mal | 0.0080 | 0.0142 | 0.0144* | 0.0000 | 0.0009 | 0.0173 | 0.0000 | 0.0056 | 0.0232* | 0.0040 | 0.0010 | 0.0153* | 0.0152* | 0.0000
Men | 0.0196* | 0.0084 | 0.0021 | 0.0000 | 0.0088 | 0.0322** | 0.0087 | 0.0169** | 0.0221* | 0.0077 | 0.0167** | 0.0036 | 0.0153* | 0.0120* | 0.0026
Lok | 0.0275* | 0.0373** | 0.0193* | 0.0200* | 0.0314* | 0.0156 | 0.0135 | 0.0315** | 0.0447*** | 0.0370* | 0.0130 | 0.0015 | 0.0441*** | 0.0236* | 0.0211 | 0.0110
Yor | 0.0035 | 0.0098 | 0.0000 | 0.0000 | 0.0000 | 0.0191* | 0.0017 | 0.0153* | 0.0107 | 0.0031 | 0.0056 | 0.0011 | 0.0320** | 0.0000 | 0.0000 | 0.0000
Bas | 0.0279** | 0.0384*** | 0.0239*** | 0.0234** | 0.0285* | 0.0300** | 0.0156* | 0.0268*** | 0.0396*** | 0.0215* | 0.0286*** | 0.0237*** | 0.0462*** | 0.0160* | 0.0184* | 0.0246**
Bam | 0.0147 | 0.0298*** | 0.0182** | 0.0175* | 0.0210* | 0.0190* | 0.0209** | 0.0198** | 0.0223** | 0.0259** | 0.0155** | 0.0145** | 0.0512*** | 0.0070 | 0.0189* | 0.0227**
Bak | 0.0263** | 0.0399*** | 0.0280*** | 0.0248*** | 0.0332*** | 0.0365*** | 0.0220** | 0.0358*** | 0.0306*** | 0.0323** | 0.0258*** | 0.0322*** | 0.0496*** | 0.0264*** | 0.0306*** | 0.0314***
Ewo | 0.0311*** | 0.0400*** | 0.0265*** | 0.0332*** | 0.0409*** | 0.0372*** | 0.0301*** | 0.0379*** | 0.0326*** | 0.0391*** | 0.0306*** | 0.0267*** | 0.0566*** | 0.0248*** | 0.0420*** | 0.0350***
Fal | 0.0271* | 0.0361** | 0.0271*** | 0.0359*** | 0.0467*** | 0.0207* | 0.0254** | 0.0244** | 0.0326** | 0.0387** | 0.0138* | 0.0131* | 0.0538*** | 0.0117 | 0.0300** | 0.0355***
Tup | 0.0157 | 0.0330** | 0.0148* | 0.0180* | 0.0296* | 0.0296* | 0.0081 | 0.0313** | 0.0188 | 0.0224 | 0.0195* | 0.0214* | 0.0445** | 0.0189* | 0.0177 | 0.0184*
Tal | 0.0000 | 0.0124 | 0.0086 | 0.0147 | 0.0231 | 0.0244* | 0.0243* | 0.0087 | 0.0010 | 0.0254 | 0.0000 | 0.0002 | 0.0372** | 0.0048 | 0.0215 | 0.0093
Tua | 0.0139 | 0.0291* | 0.0138 | 0.0132 | 0.0150 | 0.0383** | 0.0231* | 0.0231* | 0.0364** | 0.0332* | 0.0127 | 0.0107 | 0.0410** | 0.0229* | 0.0223 | 0.0122
Hau | 0.0051 | 0.0008 | 0.0000 | 0.0163 | 0.0272* | 0.0175 | 0.0199* | 0.0106 | 0.0016 | 0.0010 | 0.0089 | 0.0059 | 0.0242* | 0.0000 | 0.0137 | 0.0101
Maa | 0.0113 | 0.0134 | 0.0090 | 0.0000 | 0.0053 | 0.0432*** | 0.0157* | 0.0267** | 0.0179* | 0.0207* | 0.0103 | 0.0180** | 0.0419*** | 0.0034 | 0.0110 | 0.0063
Pod | 0.0286** | 0.0169* | 0.0207*** | 0.0352*** | 0.0397** | 0.0294** | 0.0338*** | 0.0309** | 0.0188* | 0.0244* | 0.0299*** | 0.0153* | 0.0509*** | 0.0258** | 0.0462*** | 0.0263**
Uld | 0.0230* | 0.0256* | 0.0099 | 0.0163 | 0.0220 | 0.0199 | 0.0174* | 0.0269** | 0.0187* | 0.0266* | 0.0143 | 0.0089 | 0.0425*** | 0.0174* | 0.0296* | 0.0167
Dab | 0.0322* | 0.0232 | 0.0272* | 0.0435** | 0.0582** | 0.0514*** | 0.0480*** | 0.0438** | 0.0265* | 0.0510** | 0.0366** | 0.0231* | 0.0737*** | 0.0135 | 0.0528** | 0.0314*

Table 5 (continued; columns Lok to Uld)

Group | Lok | Yor | Bas | Bam | Bak | Ewo | Fal | Tup | Tal | Tua | Hau | Maa | Pod | Uld
Yor | 0.0031
Bas | 0.0342** | 0.0067
Bam | 0.0223* | 0.0000 | 0.0026
Bak | 0.0418*** | 0.0113 | 0.0088 | 0.0115*
Ewo | 0.0320** | 0.0156* | 0.0234*** | 0.0079 | 0.0168**
Fal | 0.0106 | 0.0126 | 0.0308*** | 0.0106 | 0.0296*** | 0.0192**
Tup | 0.0303* | 0.0011 | 0.0229* | 0.0179* | 0.0147* | 0.0255** | 0.0274*
Tal | 0.0133 | 0.0000 | 0.0242* | 0.0000 | 0.0172* | 0.0105 | 0.0000 | 0.0025
Tua | 0.0014 | 0.0004 | 0.0277** | 0.0110 | 0.0220* | 0.0215* | 0.0047 | 0.0185 | 0.0000
Hau | 0.0363* | 0.0000 | 0.0135 | 0.0000 | 0.0186* | 0.0157 | 0.0103 | 0.0186 | 0.0000 | 0.0256
Maa | 0.0203* | 0.0000 | 0.0199** | 0.0084 | 0.0186** | 0.0163 | 0.0160* | 0.0053 | 0.0000 | 0.0030 | 0.0054
Pod | 0.0344** | 0.0109 | 0.0308*** | 0.0186** | 0.0256*** | 0.0270*** | 0.0197* | 0.0203* | 0.0034 | 0.0269* | 0.0141 | 0.0112
Uld | 0.0032 | 0.0000 | 0.0309** | 0.0115 | 0.0207** | 0.0157* | 0.0004 | 0.0000 | 0.0000 | 0.0000 | 0.0102 | 0.0006 | 0.0159
Dab | 0.0455** | 0.0096 | 0.0427** | 0.0190 | 0.0410** | 0.0312** | 0.0208 | 0.0376* | 0.0000 | 0.0448** | 0.0006 | 0.0185 | 0.0000 | 0.0296*

  1. Ethnolinguistic group abbreviations: ESS = Eastern Senegal Speakers, Peu = Peul, Ful = Fulbe, Wol = Wolof, Ser = Serere, Nal = Nalú, Bal = Balante, Pap = Papel, Dio = Diola, Bij = Bijagó, Lim = Limba, Tem = Temne, Mae = Mandenka, Bab = Bambara, Mal = Malinke, Men = Mende, Lok = Loko, Yor = Yoruba, Bas = Bassa, Bam = Bamileke, Bak = Bakaka, Ewo = Ewondo, Fal = Fali, Tup = Tupuri, Tal = Tali, Tua = Tuareg, Hau = Hausa, Maa = Mandara, Pod = Podokwo, Uld = Ouldeme, Dab = Daba.

  2. * = p < 0.05; ** = p < 0.01; *** = p < 0.001.

Figure 5.

PC analysis based on haplogroup frequencies of West African ethnolinguistic groups. Abbreviations as in Table 5.

Discussion

Mauritania

The augmented sample size in this study has not substantially changed the relative proportions of the sub-Saharan African and western Eurasian genetic components previously found in Mauritania (Rando et al. 1998). L1b and L2a1 frequencies clearly align Mauritania with the sub-Saharan western African populations, whereas the Eurasian N derivatives pull it towards North Africa. So, congruent with its geographic situation, Mauritania is genetically intermediate between both areas. However, the low frequency of L3A lineages (6%) places the Maure far from western sub-Saharan countries. Even the west Saharan countries at more northern latitudes (Rando et al. 1998 and our unpublished results) have higher L3A frequencies (15%) than the Maure. This is in contrast to the northward decreasing cline of L3A frequencies in northeast Africa: 37% in Sudan, 18% in Nubia and 9% in Egypt (Krings et al. 1999; Stevanovitch et al. 2003). In fact, when linguistically differentiated Berber and Arab populations from Morocco are compared by their L3A frequencies (6% vs 12%; data from Rando et al. 1998; Thomas et al. 2002; Rosa et al. 2004 and our unpublished results), significant differences are found between them (P = 0.007), the Berber frequency being similar to that of the Maure. This L3A difference between Berber and Arab speaking groups extends to Algeria (7% vs 15%; P < 0.001; data from Côrte-Real et al. 1996; Rosa et al. 2004 and our unpublished results) and to Tunisia (10% vs 23%; P = 0.028; data from Plaza et al. 2003; Fadhlaoui-Zid et al. 2004). The latter significance level would increase if only the most genetically isolated Tunisian Berbers, those from Chenini-Douiret, were compared. One can ask whether these differences between Berber and Arab speakers could be the result of a demic diffusion during the Muslim Arab occupation of North Africa. However, the L3A subclades found in Arab (Di Rienzo & Wilson, 1991; Richards et al. 2000) and Yemeni populations (Di Rienzo & Wilson, 1991; Kivisild et al. 2004) seem to have a recent eastern African provenance (Richards et al. 2003; Kivisild et al. 2004). Furthermore, the more abundant L3A subclades in Egypt have their closest matches in eastern sub-Saharan Africa, whereas those from Morocco, Algeria and Tunisia have theirs in western and central sub-Saharan Africa. So, it seems that the most probable origin of these northern African L3A lineages was the historic Arab slave trade, which had less impact on the non-Arab and more isolated Berber groups, including those from Mauritania (Flores et al. 2000). The corollary is that, at least in the Maure, the sub-Saharan African L1 and L2 components had an independent and older origin than the L3A lineages.
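
The Berber versus Arab comparisons above are tests on 2 x 2 tables of L3A and non-L3A counts. The text does not state which test was applied or the sample sizes behind each percentage, so the following Python sketch is only an illustration: it uses a two-sided Fisher exact test as a stand-in, and the counts (12 of 200 versus 24 of 200, chosen to reproduce 6% and 12%) are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    Rows are populations; columns are L3A and non-L3A lineage counts.
    The p-value sums the probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x L3A lineages in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical counts: 12/200 (6%) in Berbers vs 24/200 (12%) in Arabs
p_value = fisher_exact_two_sided(12, 188, 24, 176)
print(f"two-sided p = {p_value:.4f}")
```

Because the outcome depends strongly on sample size, the P-values reported in the text cannot be reproduced from the percentages alone; the original counts would be needed.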

Mali

The Bambara are the largest ethnolinguistic group in Mali and, together with the Malinke, account for 50% of the population. Although both groups speak related dialects belonging to the Mande branch, the Bambara have been found to be genetically less related to the other Mande groups, including the Malinke (Figure 4). In fact, a contingency test shows that this difference is significant (P < 0.01). Comparatively, the Bambara are characterized by higher frequencies of L1c and L3b lineages, whereas the Malinke have a larger proportion of L1b and L2c lineages. Except for L1c*, which has its peak in the Loko and lacks representation in the Bambara and Malinke, L1c lineages are better represented in Cameroon than in the rest of the countries studied. In contrast, L1b and L2c show higher frequencies in the west. This distribution is congruent with the northeastern geographic position that the Bambara occupy with respect to the Malinke. That the Bambara also carry L0a, L3e1 and L3f lineages reinforces a possible Central African influence shared with Cameroon. As Bambara is also the dominant language in Mali, these results suggest a cultural homogenizing effect on a genetically diverse background.

In addition to the Mande, other minority ethnolinguistic groups have also been sampled in Mali, albeit with sample sizes insufficient for them to be treated individually. However, the Dogon, a group of uncertain linguistic assignment and presumed to be ethnically homogeneous, deserve comment. Four of the six Dogon subjects analyzed belong to the L2 cluster (0.7), whereas the other two are L3 (0.3). Although nothing can be concluded from these data alone, a recent article (Wilder et al. 2004) analyzed a sample of 37 Dogon for the mitochondrial cytochrome c oxidase 3 gene (MT-CO3). The available information from complete mtDNA sequences has allowed us to infer that the frequencies in that sample are 0.62 for L2 and 0.38 for L3. Both data sets suggest that an unusually high frequency of L2 lineages might be one of the genetic characteristics of the Dogon.
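
The caution about the six Dogon subjects can be made concrete with a confidence interval for a proportion. The Python sketch below uses the Wilson score interval, which is our choice for illustration and not a method used in the study; the count of 23 L2 carriers among the 37 Dogon of Wilder et al. (2004) is an assumption, chosen because it is the count that rounds to the inferred frequency of 0.62.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# L2 carriers among the six Dogon of this study
low, high = wilson_interval(4, 6)
print(f"this study (4/6): {low:.2f} to {high:.2f}")

# Assumed count (23/37) consistent with the inferred frequency of 0.62
low, high = wilson_interval(23, 37)
print(f"Wilder et al. sample (23/37, assumed): {low:.2f} to {high:.2f}")
```

With only six subjects the interval spans roughly 0.30 to 0.90, which is why the larger sample is needed before a high L2 frequency can be proposed as a Dogon characteristic.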

West Africa

The area occupied by Cameroon is not always considered part of the geographic region known as West Africa. Taking into account its haplogroup composition, it could also be considered a genetic outsider. Numerous lineages (L0a, L0a1a, L0a2, L2a1e, L4g and L5) have a more central-eastern than western distribution (Pereira et al. 2001; Salas et al. 2002; Kivisild et al. 2004). However, favouring a geographic continuum, the L3e haplogroup links Mali and Niger-Nigeria to Cameroon. The only exception is L3e4, which is more abundant in the western countries. These results are in agreement with the Central African origin proposed for this haplogroup (Bandelt et al. 2001; Salas et al. 2002). Finally, other lineages (L1b, L2, L2b/1, L2c1 and L3b) reveal the closeness of Mali to Senegal, Guinea and Sierra Leone.

Conclusions

The main conclusion deduced from our results is that West African genetic diversity is better structured by geographic than by ethno-linguistic criteria. This fact seems to be extendable to eastern Africa (Pereira et al. 2001) and, perhaps, to the whole continent (Salas et al. 2002). It seems that languages have spread within Africa mainly as a cultural imposition over an already genetically diverse landscape. However, more refined studies and larger sample sizes will be necessary to disentangle geographic from language effects at a more detailed level.

Acknowledgements

We would like to thank Giovanni Destro-Bisol and coauthors for allowing us the use of their unpublished and valuable data. This study was supported by grants BMC2001–3511 from Ministerio de Ciencia y Tecnología and COF2002–015 from Gobierno de Canarias to V.M.C. J.M. Moulds was supported by a grant from the National Institutes of Health (R01 AI42367).

Appendices

Appendix I

Linguistic group | Country | Ethnolinguistic group | Population | Sample | Reference
Atlantic Northern (AtN) | Mauritania (Mau) |  | Mauritania (Mau) | 30 | Rando et al. 1998
 | Mauritania (Mau) |  | Mauritania (Mau) | 34 | This study
 | Senegal (Sen) | Eastern Senegal Speakers (ESS) | Bainouk | 1 | Rando et al. 1998
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Eastern Senegal Speakers (ESS) | Banhu (BAB) | 1 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Eastern Senegal Speakers (ESS) | Cassanga (CCJ) | 6 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Eastern Senegal Speakers (ESS) | Beafada (BIF) | 19 | Rosa et al. 2004
Atlantic Northern (AtN) | Senegal (Sen) | Peul (Peu) | Peul | 7 | Rando et al. 1998
Atlantic Northern (AtN) | Mali (Mal) | Peul (Peu) | Peul (Peu) | 15 | This study
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Peul (Peu) | Fula-Preto & Forro (FUC) | 19 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Peul (Peu) | Fula-Toranca (FUT) | 1 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Fulbe (Ful) | Fula (FUL) | 38 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Fulbe (Ful) | Fula-Fula (FUF) | 19 | Rosa et al. 2004
Atlantic Northern (AtN) | Niger-Nigeria (NiN) | Fulbe (Ful) | Fulbe (WF) | 60 | Watson et al. 1997
Atlantic Northern (AtN) | Cameroon (Cam) | Fulbe (Ful) | Fulbe (Ful) | 34 | Coia et al. 2004
Atlantic Northern (AtN) | Senegal (Sen) | Wolof (Wol) | Wolof (WOL) | 48 | Rando et al. 1998
Atlantic Northern (AtN) | Senegal (Sen) |  | Tukulor | 10 | Rando et al. 1998
Atlantic Northern (AtN) | Senegal (Sen) |  | Lebou | 2 | Rando et al. 1998
Atlantic Northern (AtN) | Senegal (Sen) | Serere (Ser) | Serere (SER) | 23 | Rando et al. 1998
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Nalú (Nal) | Nalú (NAJ) | 26 | Rosa et al. 2004
Atlantic Northern (AtN) | Senegal (Sen) | Balanta (Bal) | Balante | 1 | Rando et al. 1998
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Balanta (Bal) | Balanta (BLE) | 62 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Papel (Pap) | Mancanha (MAN) | 19 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Papel (Pap) | Manjaco (MFV) | 27 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Papel (Pap) | Papel (PBO) | 23 | Rosa et al. 2004
Atlantic Northern (AtN) | Senegal (Sen) | Diola (Dio) | Diola | 8 | Rando et al. 1998
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Diola (Dio) | Baiote (BDA) | 6 | Rosa et al. 2004
Atlantic Northern (AtN) | Guinea Bissau (GuB) | Diola (Dio) | Djola (EJA) | 18 | Rosa et al. 2004
Atlantic Northern (AtN) | Senegal (Sen) |  | Manyake | 5 | Rando et al. 1998
Atlantic Northern (AtN) | Guinea Bissau (GuB) |  | Brame (BRA) | 8 | Rosa et al. 2004
Atlantic (Atl) | Guinea Bissau (GuB) | Bijagó (Bij) | Bijagó (BJG) | 22 | Rosa et al. 2004
Atlantic (Atl) | Sierra Leone (SiL) | Limba (Lim) | Limba | 68 | Jackson et al. 2005
Atlantic (Atl) | Sierra Leone (SiL) | Temne (Tem) | Temne | 121 | Jackson et al. 2005
Atlantic (Atl) | Guinea Bissau (GuB) |  | Landoma (LAN) | 1 | Rosa et al. 2004
Atlantic (Atl) | Guinea Bissau (GuB) |  | Mansonca (MSW) | 18 | Rosa et al. 2004
Mande (Man) | Mali (Mal) |  | Bobo (Bob) | 5 | This study
Mande (Man) | Senegal (Sen) | Mandenka (Mae) | Mandenka | 119 | Graven et al. 1995
Mande (Man) | Guinea Bissau (GuB) | Mandenka (Mae) | Jancanca (JAD) | 1 | Rosa et al. 2004
Mande (Man) | Guinea Bissau (GuB) | Mandenka (Mae) | Mandinga (MNK) | 30 | Rosa et al. 2004
Mande (Man) | Senegal (Sen) | Bambara (Bab) | Bambara | 5 | Rando et al. 1998
Mande (Man) | Mali (Mal) | Bambara (Bab) | Bambara (Bab) | 52 | This study
Mande (Man) | Senegal (Sen) | Malinke (Mal) | Malinke | 1 | Rando et al. 1998
Mande (Man) | Mali (Mal) | Malinke (Mal) | Malinke (Mal) | 31 | This study
Mande (Man) | Senegal (Sen) |  | Sarakhole | 3 | Rando et al. 1998
Mande (Man) | Senegal (Sen) |  | Soninke | 5 | Rando et al. 1998
Mande (Man) | Mali (Mal) |  | Soninke (Sok) | 3 | This study
Mande (Man) | Senegal (Sen) |  | Soce | 1 | Rando et al. 1998
Mande (Man) | Guinea Bissau (GuB) |  | Sussu (SUD) | 8 | Rosa et al. 2004
Mande (Man) | Sierra Leone (SiL) | Mende (Men) | Mende | 59 | Jackson et al. 2005
Mande (Man) | Sierra Leone (SiL) | Loko (Lok) | Loko | 29 | Jackson et al. 2005
Volta-Congo (VoC) | Niger-Nigeria (NiN) | Yoruba (Yor) | Yoruba (WY) | 33 | Watson et al. 1997
Volta-Congo (VoC) | Cameroon (Cam) | Bassa (Bas) | Bassa (Bas) | 46 | Coia et al. 2004 and Destro-Bisol et al. 2004
Volta-Congo (VoC) | Mali (Mal) |  | Dogon (Dog) | 6 | This study
Volta-Congo (VoC) | Mali (Mal) |  | Senoufo (Sen) | 1 | This study
Volta-Congo (VoC) | Cameroon (Cam) | Bamileke (Bam) | Bamileke (Bam) | 48 | Destro-Bisol et al. 2004
Volta-Congo (VoC) | Cameroon (Cam) | Bakaka (Bak) | Bakaka (Bak) | 50 | Coia et al. 2004
Volta-Congo (VoC) | Cameroon (Cam) | Ewondo (Ewo) | Ewondo (Ewo) | 53 | Destro-Bisol et al. 2004
Volta-Congo (VoC) | Cameroon (Cam) | Fali (Fal) | Fali (Fal) | 41 | Coia et al. 2004
Volta-Congo (VoC) | Cameroon (Cam) | Tupuri (Tup) | Tupuri (Tup) | 25 | Coia et al. 2004
Volta-Congo (VoC) | Cameroon (Cam) | Tali (Tal) | Tali (Tal) | 20 | Coia et al. 2004
Afro-Asiatic (AfA) | Mali (Mal) | Tuareg (Tua) | Tuareg (Tua) | 1 | This study
Afro-Asiatic (AfA) | Niger-Nigeria (NiN) | Tuareg (Tua) | Tuareg (WT) | 23 | Watson et al. 1997
Afro-Asiatic (AfA) | Niger-Nigeria (NiN) | Hausa (Hau) | Hausa (WH) | 20 | Watson et al. 1997
Afro-Asiatic (AfA) | Cameroon (Cam) | Mandara (Maa) | Mandara (Man) | 37 | Coia et al. 2004
Afro-Asiatic (AfA) | Cameroon (Cam) | Podokwo (Pod) | Podokwo (Pod) | 39 | Coia et al. 2004
Afro-Asiatic (AfA) | Cameroon (Cam) | Ouldeme (Uld) | Ouldeme (Uld) | 28 | Coia et al. 2004
Afro-Asiatic (AfA) | Cameroon (Cam) | Daba (Dab) | Daba (Daba) | 20 | Coia et al. 2004

  1. Coia, V., Verginelli, F., Boschi, I., Spedini, G., Comas, D., Calafell, F., Battaggia, C. & Destro-Bisol, G. (2004) Reconstructing the peopling of Cameroon through the analysis of mitochondrial DNA. AAPA 73rd Annual Meeting Abstracts, Poster. http://www.scienzemfn.uniroma1.it/labantro/database.html

  2. Destro-Bisol, G., Coia, V., Boschi, I., Verginelli, F., Caglià, A., Pascali, V., Spedini, G. & Calafell, F. (2004) The analysis of variation of mtDNA hypervariable region 1 suggests that eastern and western Pygmies diverged before the Bantu expansion. The American Naturalist 163(2), 212–226.

  3. Graven, L., Passarino, G., Semino, O., Boursot, P., Santachiara-Benerecetti, S., Langaney, A. & Excoffier, L. (1995) Evolutionary correlation between control region sequence and restriction polymorphisms in the mitochondrial genome of a large Senegalese Mandenka sample. Mol Biol Evol 12, 334–345.

  4. Jackson, B. A., Wilson, J. L., Kirbah, S., Sidney, S. S., Rosenberger, J., Bassie, L., Alie, J. A. D., McLean, D. C., Garvey, W. T. & Ely, B. (2005) Mitochondrial DNA genetic diversity among four ethnic groups in Sierra Leone. Am J Phys Anthropol, in press. http://www.bumc.bu.edu/Dept/Content.aspx?DepartmentID=350&PageID=9313

  5. Rando, J. C., Pinto, F., González, A. M., Hernández, M., Larruga, J. M., Cabrera, V. M. & Bandelt, H.-J. (1998) Mitochondrial DNA analysis of Northwest African populations reveals genetic exchanges with European, Near-Eastern, and sub-Saharan populations. Ann Hum Genet 62, 531–550.

  6. Rosa, A., Brehm, A., Kivisild, T., Metspalu, E. & Villems, R. (2004) MtDNA profile of west Africa Guineans: Towards a better understanding of the Senegambia region. Ann Hum Genet 68, 340–352.

  7. Watson, E., Forster, P., Richards, M. & Bandelt, H.-J. (1997) Mitochondrial footprints of human expansions in Africa. Am J Hum Genet 61, 691–704.

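Table Appendix II.  Haplogroup frequencies, in percentages, for the different countries

Haplogroup | Mau | Sen | GuB | Mal | SiL | NiN | Cam
H/HV*/U*/R* | 18.7 | 2.1 | - | 0.8 | 0.4 | 5.7 | -
preHV | 3.1 | - | - | - | - | - | -
V | 3.1 | - | - | - | - | 0.6 | -
U5 | 1.6 | - | - | - | - | - | -
U5b | 4.7 | 0.8 | 2.7 | - | - | 0.6 | 0.9
U6a | 12.5 | - | - | - | - | - | 0.5
U6a1 | 4.7 | 0.8 | 2.2 | - | 1.4 | 2.5 | -
U6b | - | 0.4 | - | - | - | 0.6 | 0.2
K | 4.7 | - | - | - | - | - | -
J | 3.1 | - | - | - | - | - | -
N1c | - | - | - | - | - | 1.3 | -
A | - | - | - | - | 0.4 | - | -
M1 | - | 0.4 | 1.1 | - | - | - | -
N*/M*/L3* | - | - | - | - | 1.4 | 1.3 | 2.7
L3b | 3.1 | 5.4 | 4.8 | 10.5 | 9.7 | 11.3 | 7.0
L3b1 | 1.6 | 4.2 | 3.8 | 6.5 | 1.8 | 4.4 | 1.6
L3d/d2 | - | 7.5 | 7.5 | 7.3 | 9.0 | 5.0 | 4.3
L3d1 | - | - | 1.3 | - | - | 1.9 | 2.7
L3d3 | - | 0.8 | 0.5 | 0.8 | - | - | 0.7
L3e1 | - | - | - | 0.8 | 0.7 | 1.3 | 2.0
L3e1a | - | - | - | - | - | - | 0.9
L3e1b | - | 0.4 | - | - | - | - | 1.8
L3e2 | - | 1.3 | 2.7 | 4.0 | 1.1 | 4.4 | 4.8
L3e2b | - | 0.4 | 1.6 | 5.6 | 4.0 | 5.0 | 5.0
L3e3 | - | 0.4 | - | 0.8 | - | 1.3 | 1.4
L3e4 | 1.6 | 2.5 | 3.0 | 0.8 | 1.8 | - | 1.6
L3e5 | - | 0.8 | - | 0.8 | - | - | 2.0
L3f | - | - | - | 0.8 | - | 1.3 | 1.8
L3f1 | - | 1.7 | 2.4 | 0.8 | 4.7 | 5.0 | 2.7
L3h | - | - | 3.5 | - | 0.7 | 0.6 | 0.7
L4g | - | - | - | - | - | - | 2.9
L2 | 1.6 | 15.8 | 14.0 | 11.3 | 9.0 | 3.1 | 2.3
L2a | - | 3.8 | 4.0 | 0.8 | 9.4 | 6.3 | 6.6
L2a1 | 9.4 | 7.5 | 7.5 | 14.5 | 7.6 | 6.3 | 6.8
L2a1a | - | 0.4 | 0.5 | 1.6 | 0.4 | 3.1 | -
L2a1b | - | 0.8 | 0.8 | - | - | - | 0.2
L2a1beta3 | 3.1 | 2.1 | 1.9 | 3.2 | 1.8 | 5.6 | 3.4
L2a1d | - | 0.8 | 1.6 | 0.8 | 1.1 | 1.3 | 0.5
L2a1e | - | - | - | - | - | - | 0.2
L2a3 | - | 0.8 | - | - | - | - | 1.1
L2b | - | 2.5 | 2.2 | 1.6 | 1.1 | - | 1.1
L2b1 | 1.6 | 5.4 | 5.6 | 3.2 | 1.1 | 0.6 | 0.9
L2c1 | - | 3.3 | 0.8 | 0.8 | 0.7 | 0.6 | -
L2c2 | - | 1.7 | 1.6 | 3.2 | 2.9 | - | 0.5
L2d1 | - | - | 1.1 | 0.8 | 1.1 | 0.6 | 1.8
L2d2 | - | 1.7 | 0.8 | - | 3.2 | 0.6 | 0.9
L5 | - | - | - | - | - | - | 0.5
L1b | 1.6 | 3.8 | 1.3 | 4.0 | 4.3 | 1.9 | -
L1b1 | 17.2 | 15.0 | 8.9 | 7.3 | 10.1 | 11.9 | 6.1
L1c* | 3.1 | 1.3 | 2.7 | - | 5.1 | 1.3 | 0.9
L1c1a | - | - | 0.5 | 0.8 | - | - | 1.1
L1c1a1 | - | - | - | - | - | - | 2.3
L1c2 | - | - | - | - | - | - | 3.2
L1c3a1 | - | 2.1 | 1.3 | 1.6 | - | 0.6 | 1.8
L1c3b1 | - | 0.4 | 0.5 | 1.6 | 0.7 | 0.6 | 0.9
L0a | - | - | - | 0.8 | - | - | 2.3
L0a1 | - | 0.8 | 5.1 | 1.6 | 3.2 | 1.3 | 2.3
L0a1a | - | - | - | - | - | 0.6 | 2.3
L0a2 | - | - | - | - | - | - | 1.8
Sample | 64 | 240 | 372 | 124 | 277 | 160 | 441

  1. Country abbreviations as in Figure 3.
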
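Table Appendix III.  Haplogroup frequencies, in percentages, for the different linguistic groups

Haplogroup | AtN | Atl | Man | VoC | AfA
H/HV*/U*/R* | 1.2 | 0.4 | 1.4 | - | 2.4
preHV | - | - | - | - | -
V | - | - | - | 0.3 | -
U5 | - | - | - | - | -
U5b | 3.4 | - | - | - | -
U6a | - | - | - | - | 1.2
U6a1 | 1.6 | 0.9 | 1.4 | - | 0.6
U6b | 0.4 | - | - | - | 0.6
K | - | - | - | - | -
J | - | - | - | - | -
N1c | - | - | - | 0.3 | 0.6
A | - | - | 0.3 | - | -
M1 | 1.0 | - | - | - | -
N*/M*/L3* | 0.2 | 1.3 | 0.3 | 1.9 | 3.0
L3b | 6.9 | 8.3 | 7.4 | 5.9 | 9.5
L3b1 | 4.3 | 3.5 | 3.4 | 0.3 | 4.2
L3d/d2 | 5.9 | 9.1 | 9.1 | 5.3 | 2.4
L3d1 | 1.6 | 0.9 | - | 1.5 | 3.0
L3d3 | 0.4 | - | 0.9 | 0.9 | -
L3e1 | - | 0.9 | 0.3 | 2.2 | 2.4
L3e1a | - | - | - | 0.9 | 0.6
L3e1b | 0.2 | - | - | 2.2 | 0.6
L3e2 | 3.0 | 1.7 | 2.0 | 4.6 | 4.2
L3e2b | 2.8 | 3.5 | 2.0 | 3.1 | 8.3
L3e3 | 0.2 | - | - | 2.5 | 0.6
L3e4 | 2.0 | 2.2 | 2.3 | 2.2 | -
L3e5 | 0.8 | - | - | 0.3 | 4.2
L3f | 0.4 | - | 0.3 | 2.5 | -
L3f1 | 2.6 | 4.8 | 1.7 | 3.7 | 2.4
L3h | 1.8 | 1.7 | 0.6 | 0.9 | 0.6
L4g | - | - | - | 2.8 | 2.4
L2 | 10.7 | 10.9 | 15.6 | 1.9 | 1.8
L2a | 4.0 | 8.3 | 4.3 | 5.9 | 8.9
L2a1 | 8.1 | 7.0 | 8.5 | 7.7 | 6.5
L2a1a | 0.8 | - | 0.9 | 0.3 | 0.6
L2a1b | 0.4 | - | 0.9 | 0.3 | -
L2a1beta3 | 2.2 | 2.6 | 2.0 | 4.0 | 3.6
L2a1e | - | - | - | 0.3 | -
L2a1d | 1.2 | 0.4 | 1.4 | 0.6 | 0.6
L2a3 | 0.4 | - | - | 0.3 | 2.4
L2b | 2.6 | 0.9 | 1.4 | 0.6 | 1.2
L2b1 | 5.7 | 2.6 | 1.1 | 1.5 | 0.6
L2c1 | 0.6 | 1.3 | 2.3 | - | 0.6
L2c2 | 1.0 | 2.6 | 2.8 | 0.6 | 0.6
L2d1 | 0.8 | 1.3 | 0.3 | 2.5 | 0.6
L2d2 | 1.0 | 2.2 | 1.7 | 0.3 | 1.8
L5 | - | - | - | 0.6 | -
L1b | 2.2 | 2.6 | 4.0 | 0.6 | -
L1b1 | 10.9 | 7.8 | 12.8 | 5.3 | 8.3
L1c* | 2.2 | 4.3 | 2.0 | 0.9 | 0.6
L1c1a | 0.4 | - | 0.3 | 1.5 | -
L1c1a1 | - | - | - | 3.1 | -
L1c2 | 0.2 | - | - | 3.7 | 0.6
L1c3a1 | 1.0 | 0.9 | 1.4 | 1.5 | 2.4
L1c3b1 | 0.6 | 0.9 | 0.6 | 0.9 | 1.2
L0a | - | - | 0.3 | 3.1 | -
L0a1 | 2.8 | 4.3 | 2.3 | 2.5 | 1.8
L0a1a | - | - | - | 2.5 | 1.8
L0a2 | - | - | - | 2.5 | -
Sample | 506 | 230 | 352 | 323 | 168

  1. AtN = Atlantic Northern, Atl = Atlantic, Man = Mande, VoC = Volta-Congo, AfA = Afro-Asiatic.

  2. Ethnolinguistic groups included in each linguistic family are indicated in Appendix I.
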
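Table Appendix IV.  Haplogroup frequencies, in percentages, for the different ethnolinguistic groups

Haplogroup | ESS | Peu | Ful | Wol | Ser | Nal | Bal | Pap | Dio | Bij | Lim | Tem | Mae | Bab | Mal | Men | Lok | Yor | Bas | Bam | Bak | Ewo | Fal | Tup | Tal | Tua | Hau | Maa | Pod | Uld | Dab
H/HV*/U*/R* | - | - | 3.3 | - | - | - | - | - | - | - | - | 0.8 | 1.3 | 1.8 | - | - | - | - | - | - | - | - | - | - | - | 12.5 | 5.0 | - | - | - | -
preHV | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
V | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 3.0 | - | - | - | - | - | - | - | - | - | - | - | - | -
U5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
U5b | - | 7.1 | 7.3 | 2.1 | 4.3 | - | - | 1.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
U6a | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.6 | 3.6 | -
U6a1 | - | - | 2.0 | 2.1 | - | - | - | 5.8 | - | - | - | 1.7 | 2.0 | - | - | 3.4 | - | - | - | - | - | - | - | - | - | 4.2 | - | - | - | - | -
U6b | - | - | 0.7 | 2.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 3.6 | -
K | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
J | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
N1c | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 3.0 | - | - | - | - | - | - | - | - | 5.0 | - | - | - | -
A | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
M1 | - | - | - | - | 4.3 | - | 3.2 | - | 6.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | -
N*/M*/L3* | - | - | 0.7 | - | - | - | - | - | - | - | 2.9 | 0.8 | - | - | - | 1.7 | - | - | - | - | 2.0 | - | - | 16.0 | 5.0 | - | - | 2.7 | 2.6 | 10.7 | -
L3b | 11.1 | 11.9 | 8.6 | 4.2 | - | 3.8 | 3.2 | 5.8 | 6.3 | - | 11.8 | 9.1 | 4.0 | 15.8 | 6.3 | 8.5 | 10.3 | 9.1 | 2.2 | 6.3 | 2.0 | 5.7 | 12.2 | 4.0 | 10.0 | 4.2 | 15.0 | 8.1 | 5.1 | 10.7 | 20.0
L3b1 | 11.1 | 14.3 | 5.3 | - | 8.7 | - | 1.6 | 1.4 | 3.1 | 18.2 | 4.4 | 0.8 | 4.0 | 5.3 | 3.1 | 1.7 | - | 3.0 | - | - | - | - | - | - | - | - | 5.0 | 5.4 | 7.7 | - | 5.0
L3d/d2 | 3.7 | 2.4 | 4.6 | 4.2 | 4.3 | 15.4 | 12.7 | 8.7 | - | 9.1 | 5.9 | 10.7 | 8.7 | 7.0 | 12.5 | 6.8 | 13.8 | 6.1 | 10.9 | 4.2 | 2.0 | - | 9.8 | 8.0 | - | 4.2 | - | - | 5.1 | 3.6 | -
L3d1 | - | 4.8 | 3.3 | - | - | - | - | 1.4 | - | - | - | - | - | - | - | - | - | - | 2.2 | 2.1 | - | 1.9 | 2.4 | 4.0 | - | - | - | 2.7 | 7.7 | - | 5.0
L3d3 | - | - | - | - | - | - | - | 2.9 | - | - | - | - | 1.3 | 1.8 | - | - | - | - | 2.2 | 4.2 | - | - | - | - | - | - | - | - | - | - | -
L3e1 | - | - | - | - | - | - | - | - | - | - | 2.9 | - | - | 1.8 | - | - | - | 3.0 | 4.3 | - | 8.0 | - | - | - | - | 4.2 | - | 2.7 | 5.1 | - | -
L3e1a | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.1 | - | 1.9 | - | - | 5.0 | - | - | - | - | - | 5.0
L3e1b | - | - | - | 2.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4.3 | - | 6.0 | 1.9 | - | - | 5.0 | - | - | - | 2.6 | - | -
L3e2 | - | - | 5.3 | 2.1 | - | 3.8 | 3.2 | - | 9.4 | 4.5 | 1.5 | 1.7 | 2.0 | 3.5 | 3.1 | - | - | 6.1 | - | 2.1 | 4.0 | 5.7 | 7.3 | 12.0 | 5.0 | 4.2 | 5.0 | 5.4 | - | 10.7 | -
L3e2b | - | 4.8 | 3.3 | 2.1 | - | 3.8 | - | 2.9 | 9.4 | - | - | 6.6 | - | 3.5 | - | 5.1 | - | 6.1 | - | 4.2 | 2.0 | 1.9 | 2.4 | 4.0 | 10.0 | - | 5.0 | 2.7 | 17.9 | - | 25.0
L3e3 | - | - | - | - | 4.3 | - | - | - | - | - | - | - | - | - | - | - | - | 3.0 | 4.3 | 2.1 | 6.0 | - | - | - | - | - | 5.0 | - | - | - | -
L3e4 | 7.4 | - | - | 2.1 | - | 3.8 | 7.9 | 1.4 | - | - | 7.4 | - | 3.3 | 1.8 | 3.1 | - | - | - | - | - | 8.0 | 1.9 | - | 8.0 | - | - | - | - | - | - | -
L3e5 | - | 2.4 | 0.7 | - | 4.3 | - | - | - | 3.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4.0 | - | - | - | 5.4 | 7.7 | 7.1 | -
L3f | - | - | 1.3 | - | - | - | - | - | - | - | - | - | - | 1.8 | - | - | - | - | 6.5 | 4.2 | 2.0 | 3.8 | - | - | - | - | - | - | - | - | -
L3f1 | 7.4 | 2.4 | 2.6 | - | - | - | - | 2.9 | 9.4 | 4.5 | 4.4 | 5.8 | 0.7 | - | - | 5.1 | - | 6.1 | 4.3 | 6.3 | 2.0 | 1.9 | - | 4.0 | 10.0 | 4.2 | 5.0 | 2.7 | 2.6 | - | -
L3h | 14.8 | - | 0.7 | - | - | - | 1.6 | 2.9 | - | 4.5 | 1.5 | 0.8 | 1.3 | - | - | - | - | - | - | - | - | - | - | 4.0 | 10.0 | 4.2 | - | - | - | - | -
L4g | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.2 | - | 2.0 | 7.5 | 7.3 | - | - | - | - | 2.7 | 5.1 | - | 5.0
L2 | 7.4 | 16.7 | 8.6 | 6.3 | 4.3 | 15.4 | 6.3 | 18.8 | 15.6 | 13.6 | 11.8 | 9.9 | 24.0 | 8.8 | 12.5 | 8.5 | - | - | 2.2 | - | 2.0 | - | 2.4 | - | 5.0 | - | 10.0 | - | 2.6 | - | -
L2a | 3.7 | 4.8 | 4.0 | 2.1 | 4.3 | 7.7 | 1.6 | 7.2 | 3.1 | - | 10.3 | 9.9 | 4.0 | 1.8 | - | 3.4 | 17.2 | 3.0 | - | 6.3 | 2.0 | 7.5 | 17.1 | - | 15.0 | 20.8 | - | 5.4 | 7.7 | 14.3 | 5.0
L2a1 | 11.1 | 7.1 | 5.3 | 14.6 | 17.4 | - | 9.5 | 8.7 | 3.1 | 4.5 | 11.8 | 4.1 | 3.3 | 14.0 | 21.9 | 10.2 | 6.9 | 12.1 | 8.7 | 10.4 | 6.0 | 3.8 | 7.3 | 8.0 | 10.0 | 12.5 | 5.0 | 13.5 | - | 3.6 | 5.0
L2a1a | - | 2.4 | 1.3 | - | - | - | - | 1.4 | - | - | - | - | - | 3.5 | - | 1.7 | - | 3.0 | - | - | - | - | - | - | - | - | 5.0 | - | - | - | -
L2a1b | - | - | - | - | - | 3.8 | 1.6 | - | - | - | - | - | 2.0 | - | - | - | - | - | - | 2.1 | - | - | - | - | - | - | - | - | - | - | -
L2a1beta3 | - | - | 3.3 | 2.1 | - | 3.8 | 3.2 | 1.4 | - | 9.1 | - | 3.3 | 1.3 | 5.3 | 3.1 | 1.7 | - | 3.0 | 6.5 | 8.3 | 2.0 | 1.9 | 7.3 | - | - | - | 15.0 | - | 2.6 | 3.6 | 5.0
L2a1e | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.1 | - | - | - | - | - | - | - | - | - | - | -
L2a1d | - | 2.4 | 3.3 | - | - | - | - | - | - | - | - | 0.8 | 2.0 | - | - | 1.7 | 3.4 | - | - | - | - | - | - | 8.0 | - | 4.2 | - | - | - | - | -
L2a3 | - | - | - | 4.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.1 | - | - | - | - | - | - | - | 8.1 | 2.6 | - | -
L2b | - | - | 0.7 | 6.3 | - | 3.8 | 3.2 | 1.4 | 12.5 | - | 1.5 | - | 0.7 | 1.8 | 3.1 | 1.7 | 3.4 | - | - | - | 2.0 | - | - | 4.0 | - | - | - | 2.7 | - | 3.6 | -
L2b1 | 3.7 | 2.4 | 0.7 | 8.3 | 17.4 | 15.4 | 3.2 | 8.7 | 6.3 | 4.5 | 2.9 | 0.8 | 2.0 | - | 3.1 | - | - | 3.0 | - | 4.2 | - | - | - | - | - | - | - | - | - | 3.6 | -
L2c1 | 7.4 | - | - | 2.1 | - | - | - | - | - | - | - | 1.7 | 4.7 | - | 3.1 | - | - | - | - | - | - | - | - | - | - | 4.2 | - | - | - | - | -
L2c2 | - | - | 0.7 | - | - | - | 4.8 | - | - | - | 5.9 | 1.7 | 2.7 | 1.8 | 6.3 | 1.7 | 3.4 | - | - | - | - | 1.9 | - | - | - | - | - | 2.7 | - | - | -
L2d1 | - | - | 1.3 | - | - | 3.8 | - | 1.4 | - | - | - | 2.5 | - | 1.8 | - | - | - | - | 4.3 | 4.2 | - | 7.5 | - | - | - | - | 5.0 | - | - | - | -
L2d2 | - | - | - | 6.3 | - | - | 3.2 | - | - | - | 1.5 | 3.3 | 1.3 | - | - | 5.1 | 3.4 | - | - | - | - | - | 2.4 | - | - | - | - | 5.4 | 2.6 | - | -
L5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4.9 | - | - | - | - | - | - | - | -
L1b | - | - | 2.0 | 6.3 | 4.3 | 3.8 | 1.6 | - | 3.1 | 4.5 | 4.4 | 1.7 | 2.0 | 1.8 | 6.3 | 6.8 | 10.3 | 6.1 | - | - | - | - | - | - | - | - | - | - | - | - | -
L1b1 | 3.7 | 11.9 | 14.6 | 16.7 | 17.4 | - | 11.1 | 4.3 | 3.1 | 13.6 | 1.5 | 9.9 | 15.3 | 3.5 | 9.4 | 20.3 | 10.3 | 12.1 | 6.5 | 2.1 | 6.0 | 5.7 | - | 8.0 | 5.0 | 12.5 | 5.0 | 10.8 | 5.1 | 10.7 | 5.0
L1c* | 3.7 | - | 3.3 | - | - | 7.7 | 3.2 | - | - | - | 1.5 | 6.6 | 1.3 | - | - | 1.7 | 13.8 | 3.0 | - | 2.1 | - | 1.9 | - | - | - | - | - | - | - | 3.6 | -
L1c1a | - | - | - | - | - | - | 1.6 | 1.4 | - | - | - | - | - | 1.8 | - | - | - | - | 6.5 | - | - | 3.8 | - | - | - | - | - | - | - | - | -
L1c1a1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 6.0 | 13.2 | - | - | - | - | - | - | - | - | -
L1c2 | - | - | 0.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 13.0 | 4.2 | 8.0 | - | - | - | - | - | - | - | - | - | 5.0
L1c3a1 | - | - | 0.7 | 2.1 | 4.3 | - | - | 1.4 | 3.1 | - | - | - | 2.0 | 3.5 | - | - | - | - | - | - | - | 3.8 | 4.9 | - | 5.0 | - | 5.0 | 5.4 | - | - | 5.0
L1c3b1 | - | - | - | - | - | - | 1.6 | 1.4 | - | - | 2.9 | - | - | 3.5 | - | - | - | 3.0 | 4.3 | - | - | - | - | - | - | - | - | 5.4 | - | - | -
L0a | - | - | - | - | - | - | - | - | - | - | - | - | - | 1.8 | - | - | - | - | 2.2 | 8.3 | - | 9.4 | - | - | - | - | - | - | - | - | -
L0a1 | 3.7 | 2.4 | - | - | - | 3.8 | 11.1 | 4.3 | 3.1 | 9.1 | 1.5 | 5.0 | 2.7 | 1.8 | 3.1 | 1.7 | 3.4 | 3.0 | 2.2 | - | 2.0 | 3.8 | 4.9 | 4.0 | - | - | - | - | 2.6 | 3.6 | 5.0
L0a1a | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2.1 | 8.0 | - | 7.3 | - | - | 4.2 | - | - | 2.6 | 3.6 | -
L0a2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4.2 | 10.0 | 1.9 | - | - | - | - | - | - | - | - | -
Sample | 27 | 42 | 151 | 48 | 23 | 26 | 63 | 69 | 32 | 22 | 68 | 121 | 150 | 57 | 32 | 59 | 29 | 33 | 46 | 48 | 50 | 53 | 41 | 25 | 20 | 24 | 20 | 37 | 39 | 28 | 20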
