Mitochondrial DNA Sequence Diversity in a Sedentary Population from Egypt


*Corresponding author: Dr. Eliane Béraud-Colomb, INSERM U387 – Laboratoire d'Immunologie, Hôpital Sainte Marguerite, 270 Boulevard Sainte Marguerite, BP29, 13274 Marseille Cedex 09, France, Fax Number: (33)+ 04-91-75-73-28. E-mail:

**Present address: Laboratoire de Police Scientifique de Marseille, 97 Boulevard Camille Flammarion, 13248 Marseille Cedex 2, France.


The mitochondrial DNA (mtDNA) diversity of 58 individuals from Upper Egypt, more than half (34 individuals) from Gurna, whose population has an ancient cultural history, was studied by sequencing the control region and screening diagnostic RFLP markers.

This sedentary population presented similarities to the Ethiopian population by the L1 and L2 macrohaplogroup frequency (20.6%), by the West Eurasian component (defined by haplogroups H to K and T to X) and particularly by a high frequency (17.6%) of haplogroup M1. We statistically and phylogenetically analysed and compared the Gurna population with other Egyptian, Near East and sub-Saharan Africa populations; AMOVA and Minimum Spanning Network analysis showed that the Gurna population was not isolated from neighbouring populations.

Our results suggest that the Gurna population has conserved the trace of an ancestral genetic structure from an ancestral East African population, characterized by a high M1 haplogroup frequency. The current structure of the Egyptian population may be the result of further influence of neighbouring populations on this ancestral population.


The debate on the emergence and dispersion of modern humans has been continuing for several decades (Vigilant et al. 1991; Wolpoff et al. 1997; Templeton, 1997; Ingman et al. 2000), the difficulty being to link current populations to archaeological records. Recently genetic data and archaeological remains have supported Out of Africa migrations of early modern humans, followed by gene flow between local populations (Templeton, 2002).

Mitochondrial genetic data from North Africa are documented by two groups of populations: one composed of populations of the Nile Valley, and the other by populations of the Maghreb. The Nile Valley has been shown to be a migration corridor with populations connected by gene flow (Krings et al. 1999), and phylogeographical analysis of mitochondrial lineages of populations from the Maghreb suggests that modern humans appeared from the Near East following at least two migrations around 50 000 years and 10 000 years ago. A possible migration from Europe may also have occurred during the Neolithic period (Macaulay et al. 1999). The frequencies of the sub-Saharan component, through analysis of L1 and L2 haplogroups in northwestern Africa, are relatively low (from 3% to 14%) (Macaulay et al. 1999; Brakez et al. 2001). The most ancient haplogroup observed in the Maghreb is U6; this haplogroup is included in macrohaplogroup U (which appears to be the oldest haplogroup in Europe). Macrohaplogroup U, characterized by the 12308 HinfI site gain, has an estimated age of 50 400–58 900 years, and may originate from the L3d haplogroup which arose in sub-Saharan Africa and then moved upwards into eastern Africa, and out of eastern Africa to the Middle East (Chen et al. 2000).

Analysis of mtDNA sequence data of Near Eastern populations has shown that 5% of sub-Saharan lineages and 2% of lineages of Eastern Eurasian origin are present in the Levant (Richards et al. 2000). This geographic area appears more and more as a source for human migrations to Europe during the Paleolithic and Neolithic periods (Richards et al. 1996; Cavalli-Sforza et al. 1994). Most of the European mtDNA founder lineages originated in the Near East (Richards et al. 2000), except for the V haplogroup, which originated in Europe (Torroni et al. 2001).

Due to its central position in Northern Africa, between the Maghreb and the Near East, Egypt must have played a crucial role in modern human dispersion. Only a small amount of data is available in data banks, with 68 individuals studied by Krings et al. (1999).

In this article, we report an analysis of mtDNA diversity in a sedentary Egyptian population: the Gurna population. We have studied the mtDNA of 34 individuals, using PCR, sequencing and RFLP analysis. With a frequency of 17.6% for M1 haplogroup, the Gurna population shows a high similarity to the Ethiopian population.

Subjects and Methods


The 58 subjects for this study were volunteers and maternally unrelated. All maternal grandmothers of these individuals were born in Upper Egypt, and for the majority (34 individuals) grandmothers were born in Gurna near Luxor (see Figure 1). Gurna individuals hold an ancient cultural oral tradition that they consider as coming from ancient Egypt, and the inhabitants are sedentary people and quite isolated from recent influence (as opposed to those of a large metropolis).

Figure 1A.

Map showing the localization of studied populations. Black square corresponds to Berber population in Maghreb (MAG). Black hexagon corresponds to Bedouin population (BED). Striped triangle indicates Turkish population (TUR). Striped star indicates Syrian population (SYR). Black triangle corresponds to Palestinian (PAL) and Druze (DRZ) populations. Black star indicates area for Nubian populations (Kerma (KER) and Dongola (DON)). Striped circle corresponds to the area of Nuba (NUA) and Shilluk (SIL) populations. Black circle corresponds to Nuer (NUE) and Dinka (DIN) populations. Striped square corresponds to Kenyan population. Details of crossed area are given in Figure 1B. MAN corresponds to Mansoura population. CAI corresponds to Cairo population. ASI indicates Assiout population. GUR represents Gurna population and ASU represents Assuan population.

Figure 1B.


The Gurna population (GUR) comprised 34 individuals from Gurna. The population named “Upper Egypt” (UPE) comprised 32 individuals from the Upper Egypt area, originating from towns other than Gurna. This group was made up of 24 individuals from our study (Table 2) and 8 individuals from Krings et al. (1999), originating from the same area (sequences were retrieved from the MOUSE databank (Burckhardt et al. 1999)).

Table 2.  Sequence polymorphisms of the HVS-I Region of Upper Egyptian individuals

| Subject ID# | HVS-I variation(a) |
|---|---|
| UPE1 | 16069, 16126, |
| UPE6 | 16093, 16224, 16311 |
| UPE9 | 16126, 16163, 16186, 16189, 16271, 16294 |
| UPE10 | 16123, 16145, 16178, 16223, 16294, 16390 |
| UPE41 | 16223, 16311 |
| UPE42 | 16145, 16176, 16223, 16390 |
| UPE43 | 16086, 16129, 16148, 16166, 16183, 16187, 16189, 16278, 16311, 16355, 16362 |
| UPE31 | 16129, 16189, 16223, 16249, 16311, 16359 |
| UPE47 | 16261, 16356, 16377 |
| UPE49 | 16093, 16129, 16223, 16391 |
| UPE50 | 16129, 16148, 16168, 16172, 16187, 16188, 16189, 16223, 16230, 16311, 16320, 16348 |
| UPE51 | 16209, 16223, 16278, 16294, 16301, 16354, 16390 |
| UPE52 | 16148, 16168, 16172, 16187, 16188, 16189, 16223, 16230, 16287, 16293, 16311, 16320 |
| UPE53 | 16172, 16189, 16223, 16278, 16294, 16309, 16391 |
| UPE54 | 16192, 16298, 16311 |
| UPE56 | 16169, 16193, 16195, 16266, 16301 |
| UPE60 | 16069, 16126, 16214, 16231, 16311 |
| UPE61 | 16300, 16362 |
| UPE62 | 16147, 16172, 16223, 16248, 16311, 16355 |
| UPE64 | 16069, 16126, 16136, 16145, 16222, 16261 |
| UPE65 | 16223, 16311, 16362 |
| UPE66 | 16176, 16223, 16390 |
| UPE68 | 16145, 16176, 16223, 16286, 16311, 16390 |
| UPE38 | 16137, 16223, 16292 |

(a) Relative to the CRS (Anderson et al. 1981). Mutations are transitions unless the base change or an indel is specified explicitly. Nucleotide positions followed by C/T indicate heteroplasmy.

(b) Using the mismatched primer described in Torroni et al. 1996.

Other individuals, including Egyptian, Nubian and Sudanese (Krings et al. 1999), Kenyan (Watson et al. 1996; Horai & Hayasaka, 1990), Berber (Corte-Real et al. 1996; Pinto et al. 1996), Palestinian, Syrian, Druze, Turkish and Bedouins from Saudi Arabia (Richards et al. 2000), were also considered.

DNA Extraction and mtDNA Analysis

DNA was extracted from hair roots as described elsewhere, using the Chelex protocol (Walsh et al. 1991). Hypervariable segments 1 and 2 (HV1 and HV2) of the control region were amplified by PCR using primer pairs and amplification conditions described previously (Mogentale-Profizi et al. 2001; Torroni et al. 1997). Sequencing was carried out by Genome Express (France).

To classify mtDNAs unambiguously within haplogroups, mtDNAs were screened for diagnostic RFLP markers selectively chosen on the basis of the sequence variation observed in HV1 and HV2.

Statistical Analysis

The DNA sequences were aligned using Clustal X ver. 1.81 (Thompson et al. 1997). The MEGA program (Kumar et al. 1994) was employed to identify each haplotype using the complete deletion option, with no assumption about traditional haplotype definitions. In this way 278 nucleotide positions were retained (for each of the 322 individuals studied for the analysis of Gurna vs. other Nile Valley populations) and 281 nucleotide positions were retained (for each of the 890 individuals studied for the analysis of Gurna vs. Near East populations). The genetic structure of the populations was analyzed using AMOVA and F or φ statistics (Excoffier et al. 1992; Weir & Cockerham, 1984) with the Arlequin 2.0 program (Schneider et al. 2000). We estimated the genetic variation attributable to differences among localities within pre-specified hierarchical groups (φsc), among localities across the entire study area (φst), and among pre-specified hierarchical groups (φct).
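The complete deletion step can be illustrated with a minimal sketch. It assumes equal-length, upper-case aligned sequences and treats any character other than A, C, G or T as missing data; the function names are hypothetical and this is not MEGA's own implementation.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def complete_deletion(aligned):
    """Drop every alignment column containing a gap or ambiguous base in any sequence."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the columns where every sequence has an unambiguous base.
    keep = [i for i in range(length)
            if all(seq[i] in VALID_BASES for seq in aligned)]
    return ["".join(seq[i] for i in keep) for seq in aligned]

def haplotype_counts(aligned):
    """Count distinct haplotypes among the sequences after complete deletion."""
    return Counter(complete_deletion(aligned))

# Toy alignment (invented for illustration): column 4 holds a gap and an N.
toy = ["ACGTAC", "ACG-AC", "ATGTAC", "ACGNAC"]
trimmed = complete_deletion(toy)
counts = haplotype_counts(toy)
```

In the toy alignment one column is removed for all sequences, which is why the number of retained positions depends on which individuals are included in each comparison.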

An exact test of population differentiation was performed using a Markov chain with 1000 iterations. The Kimura 2P distance option was selected.
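For reference, Kimura's two-parameter distance between two aligned sequences can be sketched as follows. This is an illustration of the standard formula, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences; the function name is hypothetical and the code is not Arlequin's implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        # Skip sites with gaps or ambiguous bases in either sequence.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# One transition over ten compared sites (toy data).
d = kimura_2p("AAAAAAAAAA", "GAAAAAAAAA")
```

The correction makes the distance slightly larger than the raw proportion of differing sites, reflecting multiple substitutions at the same position.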

The pairwise Fst can be used to describe the short-term genetic distance between populations, with the application of a slight transformation to linearize the distance with population divergence time (Reynolds et al. 1983). A phenogram based on 12 populations using Fst values (on a matrix of coancestry coefficients (Reynolds et al. 1983)) and the Neighbor-Joining (NJ) (Saitou & Nei, 1987) clustering algorithm was calculated. A control population (South African Bantu speaking population, Soodyall et al. 1996) was used as an outgroup.
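The linearizing transformation is the coancestry distance D = −ln(1 − Fst), as stated in the appendix table captions. A minimal sketch (the function name is hypothetical, and how the original software handles negative Fst estimates is not assumed here):

```python
import math

def reynolds_distance(fst):
    """Coancestry distance D = -ln(1 - Fst) (Reynolds et al. 1983)."""
    if fst >= 1:
        raise ValueError("Fst must be below 1 for the logarithm to be defined")
    return -math.log(1.0 - fst)

# Largest pairwise Fst reported for Gurna among the Nile Valley samples.
d_gurna_max = reynolds_distance(0.06346)
```

For small Fst the distance is close to Fst itself; the transformation matters mainly for strongly differentiated pairs such as comparisons with the outgroup.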

A network was drawn using the TCS 1.13 package. This method connects existing haplotypes in a minimum spanning tree, which is essentially a parsimony method. It starts by calculating the overall limits of parsimony for the complete data set using a statistic from neutral coalescent theory (Hudson, 1989). The phylogenetic reconstruction algorithm is described by Templeton et al. (1992).


Sequence Data of Gurna Population

Both hypervariable segments (HV1 and HV2) of the control region were sequenced. Sequence polymorphisms and results of RFLP analysis are reported in Table 1 for the 34 individuals from Gurna. Sequence polymorphisms of the HV1 Region for the 24 individuals from Upper Egypt are reported in Table 2.

Table 1.  Sequence polymorphisms of the Control Region, RFLP analysis and Haplogroups of Gurna

| Subject ID# | HVS-I variation(a) | HVS-II variation(a) | Diagnostic RFLP markers | Haplogroup |
|---|---|---|---|---|
| GUR23 | 16140, 16218 | | −7025 AluI | H |
| GUR27 | | | −7025 AluI | H |
| GUR31 | 16209C, 16210, 16224 | 152C/T | −7025 AluI | H |
| GUR44 | | 146 | −7025 AluI | H |
| GUR48 | 16148, 16256, 16304, 16390 | | −7025 AluI | H |
| GUR34 | 16093, 16129, 16223, 16391 | 73, 152C/T, 199, 204 | +10032 AluI | I |
| GUR39 | 16129, 16223, 16391 | 73, 199, 204 | +10032 AluI | I |
| GUR22 | 16069, 16126, 16193, 16300, | 73, 152 | −13704 Bst0I | J |
| GUR26 | 16066T, 16069, 16093, 16126, 16193 | 73, 150, 152, 185 | −13704 Bst0I | J |
| GUR2, GUR15, GUR63 | 16129, 16148, 16168, 16172, 16187, 16188G, 16189, 16223, 16230, 16311, 16320 | 64, 93, 152, 185, 189, 200 | +3592 HpaI, +11641 HaeIII | L1a |
| GUR8 | 16129, 16148, 16168, 16172, 16187, 16188G, 16189, 16223, 16230, 16294G, 16311, 16320 | 64, 93, 185, 189, 200 | +3592 HpaI, +11641 HaeIII | L1a |
| GUR18 | 16129, 16148, 16166, 16187, 16189, 16223, 16278, 16290, 16360, 16390 | 73, 182, 195 | +3592 HpaI, +10806 HinfI, +12810 RsaI | L1e |
| GUR32 | 16129, 16148, 16166, 16187, 16189, 16223, 16278, 16290, 16311, 16360, 16390 | 73, 182, 195 | +3592 HpaI, +10806 HinfI, +12810 RsaI | L1e |
| GUR33 | 16183C, 16189, 16278, 16294, 16390 | 73, 146, 152, 195 | +3592 HpaI, +13803 HaeIII | L2a |
| GUR19 | 16223, 16311, 16362 | 73, 150 | −2349 MboI, +8616 MboI, −10084 TaqI, −10394 DdeI, −10871 MnlI | L3* |
| GUR21 | 16104, 16181C, 16182C, 16183C, 16189, 16311 | 73, 146, 153 | +10871 MnlI, −11718 HaeIII(b), −12308 HinfI(c) | L3* |
| GUR37 | 16126, 16214, 16231, 16311 | 73, 150, 195 | +10394 DdeI, +10871 MnlI, −11718 HaeIII(b), −12308 | L3* |
| GUR46 | 16169, 16193, 16195, 16266, | 73, 150, 152 | +10394 DdeI | L3* |
| GUR7 | 16093, 16120T, 16129, 16183C, 16189, 16223, 16311, 16359 | 73, 195 | +10397 AluI | M1 |
| GUR14 | 16129, 16189, 16223, 16249, 16261, 16311, 16359 | 73, 146, 195 | +10397 AluI | M1 |
| GUR20 | 16129, 16183C, 16189, 16223, 16311, 16359 | 73, 195 | +10397 AluI | M1 |
| GUR12 | 16129, 16183C, 16189, 16223, 16249, 16311 | ND (d) | +10397 AluI | M1 |
| GUR29 | 16189, 16223, 16292, 16293C, 16311, 16359 | 73, 195 | +10397 AluI | M1 |
| GUR35 | 16085G, 16129, 16183C, 16189, 16223, 16249, 16311 | 73, 195, 210C | +10397 AluI | M1 |
| GUR17, GUR40 | 16145, 16176G, 16223, 16390 | 73, 152 | +10237 HphI, +10871 MnlI | N1b |
| GUR28 | 16126, 16145, 16176G, 16223, 16390 | 73, 152, 195 | +10237 HphI, +10871 MnlI | N1b |
| GUR3 | 16126, 16294, 16296, 16324 | 73, 195 | +13366 BamHI | T |
| GUR57 | 16126, 16146, 16172, 16239, 16264, 16292, 16294 | 73, 146, 195 | +13366 BamHI, +4216 NlaIII | T3 |
| GUR13 | 16343 | 73, 150, 195 | +12308 HinfI(c) | U3 |
| GUR11, | 16261, 16356 | 73, 195 | +12308 HinfI(c) | U4 |

(a) Relative to the CRS (Anderson et al. 1981). Mutations are transitions unless the base change or an indel is specified explicitly. Nucleotide positions followed by C/T indicate heteroplasmy.

(b) Using the mismatched primer described in Richards et al. 2000.

(c) Using the mismatched primer described in Torroni et al. 1996.

(d) ND: Not determined.

For the Gurna population, 59 sites out of the 380 sites sequenced on HV1 were polymorphic with reference to the CRS (Anderson et al. 1981), and 15 sites out of the 240 sequenced nucleotides on HV2 were polymorphic. All nucleotide changes were reported, as 64 transitions and 12 transversions.

Diversity of Haplotypes

Among the 34 individuals studied 30 different haplotypes were observed (Table 1). Of these, only three were previously seen in the Nile valley populations (Krings et al. 1999) and five in the Nubian population (Krings et al. 1999). 11 haplotypes found in the Gurna population were present neither in the Nubian nor the Nile valley populations.

Haplogroup Distribution

The haplogroup distribution of the Gurna population is shown in Table 3. The main haplogroups observed in this population were M1 (6/34 individuals, 17.6%), H (5/34 individuals, 14.7%), L1a (4/34 individuals, 11.8%) and U (3/34 individuals, 8.8%).
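The percentages quoted above follow directly from the counts over the 34 Gurna individuals. A short sketch of the arithmetic, using only the counts stated in the text (the function name is hypothetical):

```python
def haplogroup_percentages(counts, sample_size):
    """Convert haplogroup counts to percentages rounded to one decimal place."""
    return {hg: round(100 * n / sample_size, 1) for hg, n in counts.items()}

# Counts for the main haplogroups as given in the text (n = 34).
gurna_counts = {"M1": 6, "H": 5, "L1a": 4, "U": 3}
percentages = haplogroup_percentages(gurna_counts, 34)
```

With a sample of 34, each individual accounts for about 2.9 percentage points, which is worth keeping in mind when comparing frequencies between populations.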

Table 3.  Distribution of mtDNA Haplogroups in Gurna population
Haplogroup | Number of individuals | Percentage
  1. (a)in +10873MnlI L3 subset.

  2. (b)in −10873MnlI L3 subset. All haplogroups are defined according to Macaulay et al. (1999) and Torroni et al. (1996).


The M1 haplogroup frequency in Gurna is similar to that observed in the East African population (20%) (Passarino et al. 1998). The M1 haplogroup was observed in the Nile Valley population (7%), Nubian population (10%) (Krings et al. 1999), Bedouins (7.1%), Druzes (2.2%) and Palestinians (1.7%), but it was not found in any other Near Eastern populations (Richards et al. 2000).

66.7% (4/6) of M1 haplotypes from Gurna exhibited the 16183C transversion, while none of the M1 haplotypes in Ethiopian/Kenyan (Quintana-Murci et al. 1999) and Middle Eastern populations harboured this mutation (Richards et al. 2000). This mutation was also observed in one M1 individual from the Upper Egypt group and in two non-M1 haplotypes from Gurna. One M1 haplotype (GUR20) found in the Gurna sample was shared with one individual from the Nubian sample (from the Kerma group).

The H haplogroup frequency in the Gurna population was lower than the H haplogroup frequency in other Middle Eastern populations (greater than 20%, except for Druzes (13%), Yemeni Jews (2.3%) and Bedouins (0%)). Compared to Nile Valley populations, the H haplogroup frequency in the Gurna population was lower (14.7% vs. 17.8%), whereas it was higher than that observed in Nubia (14.7% vs. 8.8%).

In the Gurna samples, J haplogroup distribution (5.9%) was in the same range as that observed in Near Eastern populations, i.e. between 6 and 10% (except for Iraqis (13%), Bedouins (18%) and Yemeni Jews (28%)). The J haplogroup was also found in the Nile Valley populations in the same range of values (5.6%).

Haplogroup L1 encompassed 17.6% of the haplotypes found in the Gurna population (with 11.8% of L1a and 5.9% of L1e). This value is higher than that observed in the Near East (below 3.6% for all populations in the Near East) (Richards et al. 2000), and lower than the value observed in Nubian populations (23% of L1 haplogroup and 13% of L1a haplogroup).

Haplogroup U encompassed 8.8% of Gurna haplotypes. This super haplogroup, including all the sub-haplogroups of U (U1 to U7) and the K haplogroup, was found with a value between 16% and 30% in most of the Near Eastern populations, except in the Yemeni Jewish (9.3%) (10), Nile Valley (9.9%) and Nubian (3.8%) populations (Krings et al. 1999).

Sub-haplogroup U4 encompassed 5.9% of Gurna haplotypes but its frequency was less than 3.7% in all populations from the Near East (except the Azeri population) (Richards et al. 2000) and it was absent in Nile Valley and Nubian populations.

Haplogroup U3 was also present in the Gurna population with a frequency of 2.9%. Haplogroup U3 ranged from 0.9 to 5.2% in most of the Near Eastern populations (with the exceptions of Turkish (7.5%) on the one hand and Yemeni Jews and Druzes (0%) on the other). The value of its frequency is equal to 2.5% and 1.4% respectively in Nubian and Nile Valley populations.

RFLP Analysis

Gurna population data were expressed according to the classification used by Passarino et al. (1998) for their study of an Ethiopian population (Table 4). The two groups defined by HpaI3592 (+) and DdeI10394AluI10397 (++) were similar in Ethiopian and Gurna populations (24.7% vs. 20.6% and 20.3% vs. 17.6% respectively). For the other groups (DdeI10394AluI10397 (+−) and DdeI10394AluI10397 (−−)), the two populations were not statistically different (using the χ2 test, P = 0.318).

Table 4.  RFLP analysis according to Passarino et al. (1998)

| RFLP(a) | Corresponding Haplogroup(b) | Ethiopian population(c) | Gurna population |
|---|---|---|---|
| 3592 HpaI + | All L1 and L2 | 24.7% | 20.6% |
| 10394 DdeI +, 10397 AluI + | M | 20.3% | 17.6% |
| 10394 DdeI +, 10397 AluI − | I, J, K, all L1 and L2 | 55.4% | 41.2% |
| 10394 DdeI −, 10397 AluI − | H, T, U, V, W and X | 24.3% | 38.2% |

(a) According to Passarino et al. 1998.

(b) According to Torroni et al. 1997.

(c) Ethiopian population data were published in Passarino et al. 1998.

Northern/Southern type Analysis

Northern/Southern type classification was carried out using the status of nt 16223 and nt 16311 and the status of restriction site 3592 HpaI as described elsewhere (Krings et al. 1999). This analysis showed that the Gurna population data was intermediate between that observed in Nile Valley and Nubian populations (Table 5).

Table 5.  Northern/Southern type analysis(a)

| mtDNA type(b) | Sample size | N° of mtDNA type | Percentage |
|---|---|---|---|
| Nile Valley | 51 | 45 | 74.6 |
| Southern Sudan | 11 | 9 | 19.7 |
| Nile Valley | 17 | 13 | 25.4 |
| Southern Sudan | 65 | 56 | 80.3 |

(a) According to Krings et al. 1999.

(b) Nile Valley population data are from Krings et al. 1999.

Statistical Analysis of Gurna and Nile Valley Populations

The Gurna sample was compared to the Nile Valley populations (Krings et al. 1999) and the Upper Egyptian population. South African Bantu, taken as an outgroup population, clearly appears different from all these populations using the coancestry coefficient matrix (0.465 < Fst < 1.013; the lowest value observed corresponds to the Dinka population and the highest value to the Shilluk population) (Appendix Ia).

Table Appendix Ia.  Matrix of coancestry coefficients as t/M =−ln(1-FST)(a)
  1. (a)(M = N for haploid data).


According to the population pairwise differentiation test (Appendix Ib), most of these populations appear not significantly different (with alpha value equal to 0.01), with the exception of the outgroup which is different from all the other populations. Only the Nubian population from Kerma and the Sudanese population from Dinka display a significant difference with some populations (Kerma with Assiout, Mansoura, Upper Egypt and Dongola populations, and Dinka with Kerma and Mansoura populations), although the Fst values are under 0.10 for all these populations.

Table Appendix Ib.  Matrix of M values(a)
  1. (a) (M = N for haploid data).


The beginning of differentiation was observed between the Dongola population and two other Sudanese populations (Nuba and Shilluk) (Fst = 0.14005 for the Nuba group and Fst = 0.14184 for the Shilluk group). At a minor level, the Dongola population was differentiated from the Assuan, Nuer and Dinka populations (Appendix I). The Gurna population was not statistically different from the other populations (0 < Fst < 0.06346), except from the outgroup. The number of migrants per generation (Nem) is also significant. The outgroup population presents a very low Nem, with Nem < 0.85 with all the populations. Nem is greater than 3 between all populations excluding the outgroup. The Gurna population presents high values of Nem with most populations (Nem > 30 for the Cairo, Kerma, Dongola and Assuan populations; the highest value was observed with the Upper Egyptian population, Nem = infinite).

Other populations presented high Nem values (Nem > 50): the Kerma population with Nuer and Upper Egyptian populations (Nem = 61.87 and Nem = 145.98 respectively), Shilluk population with Nuba and Nuer populations (Nem = 52.49 and Nem = 51.08 respectively), and Upper Egyptian with Mansoura population (Nem = 767.45). Some populations presented infinite values of Nem (Assiout with Mansoura and Upper Egyptian populations, Cairo with Assuan population, and Kerma with Upper Egypt, Dinka and Nuba populations) (Appendix Ic). This is due to the fact that the Fst parameter may be negative, corresponding to a negative intraclass correlation. Genetically, it means that alleles are more related between than within populations (Weir, 1996).
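The link between Fst and the number of migrants can be sketched as below. The relation M = (1 − Fst)/(2·Fst) for haploid data is an assumption on our part (the text does not state the formula), although it is consistent with the values quoted: an Fst of 0.14184 gives M just above 3, and a coancestry coefficient of 0.465 gives M just below 0.85. The function name is hypothetical.

```python
import math

def migrants_per_generation(fst):
    """Estimate M from pairwise Fst, assuming M = (1 - Fst) / (2 * Fst) for haploid data."""
    if fst > 1:
        raise ValueError("Fst cannot exceed 1")
    if fst <= 0:
        # Zero or negative Fst estimates imply no detectable differentiation,
        # which is reported as an infinite number of migrants.
        return math.inf
    return (1.0 - fst) / (2.0 * fst)

# Dongola vs. Shilluk Fst from the text, and a slightly negative estimate.
m_dongola_shilluk = migrants_per_generation(0.14184)
m_negative = migrants_per_generation(-0.002)
```

Because M grows without bound as Fst approaches zero, very large finite values such as 767.45 and infinite values both indicate pairs of samples that are essentially undifferentiated.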

Table Appendix Ic.  Differentiation test between all pairs of samples(a)
  1. (a) Markov chain length : 10000 steps. Non-differentiation exact P values.


Genetic diversity and nucleotide diversity were not significantly different for the whole data set. No significant correlation was found between them (data not shown).

For the population from Egypt and the Sudan the AMOVA revealed a spatial partitioning of the genetic variation at two hierarchical levels. Most of the variance was attributable to differences among hierarchical groups (φct = 0.109) and among localities across the entire study area (φst = 0.139). Without the outgroup population these values were respectively equal to φct = 0.009 and φst = 0.037, indicating a huge impact of the outgroup population.
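For readers less familiar with AMOVA, the φ statistics are usually defined from the variance components of Excoffier et al. (1992). The block below is a sketch in that standard notation; the symbols σ_a², σ_b² and σ_c² (variance among groups, among localities within groups, and within localities) do not appear in the original text. The last line applies the resulting identity to the values quoted with the outgroup included; the derived φsc is not reported in the study and is shown only as an illustration.

```latex
\begin{align}
\sigma_T^2 &= \sigma_a^2 + \sigma_b^2 + \sigma_c^2 \\
\phi_{ct} &= \frac{\sigma_a^2}{\sigma_T^2}, \qquad
\phi_{sc} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}, \qquad
\phi_{st} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_T^2} \\
1 - \phi_{st} &= (1 - \phi_{sc})(1 - \phi_{ct}) \\
\phi_{sc} &= 1 - \frac{1 - 0.139}{1 - 0.109} \approx 0.034
\end{align}
```

Under these definitions, φct close to φst means that most of the between-locality variance lies between the pre-specified groups rather than within them.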

A Neighbor-Joining tree was drawn using Fst. This tree shows three groups of populations: the first group included Upper Egypt, Assiout and Mansoura, the second group included Cairo, Assuan and Gurna, and the last group included the Kerma and south Sudanese populations. The Gurna population appears at the apex, on the central branch of this tree relative to the other branches (Northern and Southern populations of Egypt). The Cairo and Assuan populations were positioned at the midpoint of the tree, corresponding to an admixture of recent migrations. The Dongola population was removed from the tree because of its artifactual long-branch position (its branch was as long as the rest of the tree) (Figure 2).

Figure 2.

Unrooted phenogram of Egyptian, Nubian and Sudanese populations using Fst value and Neighbor-Joining (NJ) clustering algorithm. SIL: Shilluk population, NUA: Nuba population, DIN: Dinka population, NUE: Nuer population, KER: Kerma population, CAI: Cairo population, ASU: Assuan population, GUR: Gurna population, DON: Dongola population, UPE, Upper Egypt population, ASI: Assiout population and MAN: Mansoura population.

Haplotype Network and Nested Design

A minimum spanning network was drawn for the Gurna population (data not shown) by calculating the maximum number of mutational steps between haplotypes, and allowing parsimonious connections with a probability equal to or higher than 0.95. The unrooted parsimony cladogram constructed using the subsets of haplotypes within this limit yields three disjoint networks.

Network I includes 25 haplotypes representing the haplogroups H, U, T, J, N1b, I, M1, L3 and L2. The centre point of network I was an individual from haplogroup H. All of the L3 and L2 haplogroup individuals are in this network. Network II includes two haplotypes representing haplogroup L1e and network III includes two haplotypes representing haplogroup L1a.

Statistical Analysis of Gurna, Near Eastern and sub-Saharan Populations

For this analysis, we considered only the Gurna population, the Egyptian populations from Mansoura and Upper Egypt, the Nubian population from Kerma and the Sudanese Dinka population, because of their sample sizes. Other small samples were removed to avoid analytical and phylogenetic artifacts. These populations were compared to Near Eastern (Richards et al. 2000), Berber (Corte-Real et al. 1996; Pinto et al. 1996) and Kenyan populations (Horai & Hayasaka, 1990; Watson et al. 1997).

The Gurna sample presents an Fst (φ) range from 0 (with Upper Egypt) to 0.10867 (with Berber) (except with the outgroup Fst of 0.56) (Appendix IIa). These relatively low values place Gurna as being as similar to Near Eastern populations as to Sudanese and sub-Saharan populations.

Table Appendix IIa.  Matrix of coancestry coefficients as t/M =−ln(1-FST)(a)
  1. (a)(M = N for haploid data).


Near Eastern populations (Palestinian, Syrian, Bedouin and Druze) from the Levant show striking similarity according to the Fst and pairwise population differentiation test (for these populations, Fst values ranged from 0 to 0.019) (Appendix IIb). The Turkish population displays a similar pattern, with Fst values between 0 and 0.028 relative to these samples.

Table Appendix IIb.  Matrix of M values(a)
  1. (a)(M = N for haploid data).


On the other hand, the Dinka and Kenyan populations were clearly different from Near Eastern populations (Palestinian, Syrian, Bedouin, Druze and Turkish) with Fst values ranging from 0.13 to 0.23 (Appendix IIa). The Dinka and Kenyan populations show differences with the Berber population too (Fst = 0.19 and 0.17 respectively).

Regarding Nem values (Appendix IIc), the number of migrants per generation seems to be higher within each regional area: the highest values are observed between Palestinians and Turks (Nem = 353.57), between Syrians and Turks, and between Syrians and Palestinians. Nem values between other Near Eastern populations (Syrian, Palestinian, Bedouin, Druze and Turkish) were all higher than 17 (Appendix IIc). The Mansoura population also shows high Nem values with the populations of this group (16.34 < Nem < 68.06). A high Nem value is also observed between the sub-Saharan populations (Sudanese Dinka and Kenyan: Nem = 168.32).

Table Appendix IIc.  Differentiation test between all pairs of samples(a)
  1. (a)Markov chain length : 10000 steps. Non-differentiation exact P value.


For the Gurna population, Nem values are mostly below 10, except with the Nubian Kerma population (Nem = 33.89), Bedouins (Nem = 11.89), Kenyans (Nem = 12.71), and the Mansoura (Nem = 15.55) and Upper Egyptian (Nem = infinite) populations.

The Upper Egyptian population shows very high Nem values with the Mansoura (Nem = 767.45) and Kerma (Nem = 145.98) populations, and also reveals a significant level of Nem with the Near Eastern populations (19 < Nem < 78).

Genetic diversity was significantly different for the Druze population and the Berber population in comparison to the other populations, but nucleotide diversity was not significantly different. As for the Egypt-Sudan study, no significant correlation was found between them (data not shown).

The AMOVA revealed the same impact of the outgroup. Most of the variance was attributable to differences among hierarchical groups (φct = 0.139) and among localities across the entire study area (φst = 0.142). Without the outgroup population these values were respectively φct = 0.079 and φst = 0.082.

An unrooted Neighbor-Joining tree was drawn using Fst. Two groups of populations appear (Figure 3), Near Eastern and sub-Saharan. Gurna appears as a third group with a zero branch length.

Figure 3.

Phenogram based on Egyptian and Near Eastern populations using Fst value and Neighbor-Joining (NJ) clustering algorithm. DIN: Dinka population, KER: Kerma population, GUR: Gurna population, MAN: Mansoura population, BED: Bedouin population, PAL: Palestinian population, SYR: Syrian population, DRZ: Druze population, TUR: Turkish population, MAG: Berber population from Maghreb and KNY: Kenyan population. Gd indicates genetic diversity. Nd indicates nucleotide diversity.


Analysis of mitochondrial DNA diversity in a sedentary Egyptian population from Gurna shows that the M1 haplogroup is present in this region at a frequency similar to that observed in the Ethiopian population. The level of the M1 haplogroup in Gurna (17.6%) is the second highest value obtained for this haplogroup, just below the value of the Ethiopian population (20%) (Quintana-Murci et al. 1999). RFLP groupings of Gurna samples carried out as established by Passarino et al. (1998) showed that the Gurna population was also close to the Ethiopian population in L1 and L2 macrohaplogroup frequency and in a West Eurasian component which seems similar in frequency but different in haplogroup distribution. On the other hand, despite its M1 specificity, statistical analysis of haplotypes of Gurnawis indicated that the Gurna population was not isolated from all other neighbouring populations. It was not statistically different from any other Egyptian or Sudanese populations, exhibiting an intermediate genetic status between European and sub-Saharan according to Northern/Southern analysis (following criteria described in Krings et al. (1999)). The Gurna population appeared statistically as close to Near Eastern as to sub-Saharan and Sudanese populations, even though these two groups were statistically different.

The Gurna area could be the meeting point of two independent waves of migration from the Near East and from sub-Saharan Africa, as suggested by the central position of the Gurna population in the unrooted NJ tree and by the genetic and nucleotide diversity of the analysed populations. The presence in the Gurna gene pool of haplogroups found in Near Eastern populations but absent in sub-Saharan ones (like U4), and haplogroups found in sub-Saharan populations but only sporadically present in Near Eastern ones (like L1), reinforces this observation.

However, the Gurnawi gene pool does not consist of a simple combination of Near Eastern and sub-Saharan gene pools, but also includes an East African specific component. This situation has already been observed for the Ethiopian gene pool (Passarino et al. 1998). Thus, the report of a second population in this geographic area showing a similar distribution of mtDNA haplotypes, including the same high frequency of a specific haplogroup (M1), raises the question of a hypothetical presence of an ancestral East African population. Such a population, as evoked by Passarino et al. (1998) for Ethiopia, could have settled on a wider area from Egypt to Ethiopia (including Sudan), the differences observed in current populations being due to further influences from neighbours (South Arabian peninsula for Ethiopia (Maca-Meyer et al. 2001), sub-Saharan input for Sudan as demonstrated in this study by a high exchange rate between Sudanese and Kenyan populations). A similar hypothesis of the existence of an ancestral population characterized by a specific haplogroup could also be evoked in the Maghreb with the U6 haplogroup (Brakez et al. 2001; Rando et al. 1998).

The results of this study point to a genetic structure of the Gurna population similar to that of the Ethiopian one. This population structure has probably been conserved in some other Egyptian populations even though those which have already been analyzed, such as Mansoura, Assiout and Cairo, failed to show the same characteristics. Mansoura, Assiout and Cairo are very big cities with much continuous and current admixture of individuals from several other regions and countries forming great melting pots. Consequently, data from these great conurbations could be somewhat biased. More extensive investigation of the genetic structure of Egyptians from other villages and from Ethiopian and Sudanese populations will be required to complete the understanding of the structuring of the current population from the ancestral East African population.


We would like to thank Vincent Grimaud, Director of the Cultural Center of the French Embassy in ARE, and Dr Sixte Blanchy, for their help in this cooperative project. We are grateful to Vincent Macaulay for providing us with Near East sequence data and Dr. Antonio Torroni for his help in the RFLP analysis. We thank Dr Marie-Odile Rousset for her help in collecting samples and Dr Claude Mawas and Dr Pierre Pontarotti for allowing us to develop part of this work in their laboratory. A word of thanks is also due to Mr Bernard Mathieu, director of IFAO Cairo, for their financial support. Finally, we would like to thank Becky Tagget for her help in English. E Bouzaïd is supported by a grant from the Conseil Régional PACA. This work was supported by INSERM.


Appendix I Statistical analysis of data for Gurna and Nile valley populations

List of labels for population samples used in this table:

1: Assiout population, 2: Cairo population, 3: Gurna population, 4: Outgroup population, 5: Assuan population, 6: Kerma population, 7: Nuba population, 8: Dongola population, 9: Nuer population, 10: Mansoura population, 11: Dinka population, 12: Shilluk population and 13: Upper Egypt population.

Appendix II Statistical analysis of data for Gurna, Near Eastern and sub-Saharan populations

List of labels for population samples in this table:

1: Bedouin population, 2: Palestinian population, 3: Syrian population, 4: Druze population, 5: Turkish population, 6: Kerma population, 7: Outgroup, 8: Dinka population, 9: Berber population from Maghreb, 10: Gurna population, 11: Kenyan population, 12: Mansoura population and 13: Upper Egypt population