Diversity of Mitochondrial DNA Lineages in South Siberia


*Corresponding author: Dr. Miroslava V. Derenko, Genetics Laboratory, Institute of Biological Problems of the North, Portovaya str., 18, 685000 Magadan, Russia Fax/Phone: 7 41322 34463. E-mail: mderenko@mail.ru


To investigate the origin and evolution of aboriginal populations of South Siberia, a comprehensive mitochondrial DNA (mtDNA) analysis (HVR1 sequencing combined with RFLP typing) of 480 individuals, representing seven Altaic-speaking populations (Altaians, Khakassians, Buryats, Sojots, Tuvinians, Todjins and Tofalars), was performed. Additionally, HVR2 sequence information was obtained for 110 Altaians, providing, in particular, some novel details of the East Asian mtDNA phylogeny. The total sample revealed 81% East Asian (M*, M7, M8, M9, M10, C, D, G, Z, A, B, F, N9a, Y) and 17% West Eurasian (H, U, J, T, I, N1a, X) matrilineal genetic contribution, but with regional differences within South Siberia. The highest influx of West Eurasian mtDNAs was observed in populations from the East Sayan and Altai regions (from 12.5% to 34.5%), whereas in populations from the Baikal region this contribution was markedly lower (less than 10%). The considerable substructure within South Siberian haplogroups B, F, and G, together with the high degree of haplogroup C and D diversity revealed there, allows us to conclude that South Siberians carry the genetic imprint of the early colonization phase of Eurasia. Statistical analyses revealed that South Siberian populations contain high levels of mtDNA diversity and high heterogeneity of mtDNA sequences among populations (Fst = 5.05%), which might be due to geography but not to language or anthropological features.


Analysis of mitochondrial DNA (mtDNA) polymorphism has become a useful tool for human population and molecular evolution studies, allowing researchers to infer the pattern of female migrations and peopling of different regions of the world (Wallace, 1995). The use of the phylogeographic approach has allowed refinement of the analysis of maternal mtDNA lineages, suggesting the current model of complex demographic scenarios for the peopling of Eurasia (Richards et al. 2000). It has been shown that in the present-day Eurasian populations the mtDNA variation can be classified into two macrohaplogroups, M and N, both of them coalescing to the African macro-cluster L3, which can be considered as the most recent common ancestor of all non-Africans (Macaulay et al. 1999; Quintana-Murci et al. 1999). Macrohaplogroup M encompasses a number of East Asian-specific haplogroups, including C, D, G, E, Z and other recently described haplogroups M7, M8, M9, M10, designated previously as M* (Yao et al. 2002). Macrohaplogroup N encompasses multiple East Asian-specific lineages, including A, B, F, R9a, Y, and N9a, as well as the West Eurasian haplogroups HV, JT, UK, I, W, and X (Macaulay et al. 1999; Yao et al. 2002). While West Eurasian mtDNA variation is now quite well understood, there are only a few studies dealing with combined RFLP and HVR1 data for East Asian and Siberian populations (Torroni et al. 1993; Kolman et al. 1996; Starikovskaya et al. 1998; Schurr et al. 1999; Derenko et al. 2000; Forster et al. 2001; Derbeneva et al. 2002; Yao et al. 2002). Moreover, most of these studies were in fact generally motivated by research into the origin of Native Americans, and thereby the phylogeographic pattern of mtDNA differentiation in East Asian and Siberian populations is still poorly understood.

A particular significance of the genetic research of Altai and neighboring regions lies in the geographic location of the Altai area on the southwestern margin of Siberia, believed to be the main gateway for the initial peopling of the remainder of Siberia from the southern regions of Central Asia. Archaeological data suggest that South Siberia could have already been inhabited by modern humans in the late Pleistocene. The earliest Upper Paleolithic industries, which occurred in the Altai region and are dated to 43300 ± 1600 years B.P., are thought to be linked with a gradual transformation of a Mousterian tradition by the introduction of more progressive elements during the early stage of the mid-last glacial interstadial interval (45000–30000 years B.P.) (Derevianko, 1998; Goebel, 1999). A “mixed” technological character similar to that observed in the Altai can be found in most of the other Late Paleolithic stone industries in Siberia dating to the second half of the Karginsk interstadial interval. Analogous combinations of both Mousterian and Late Paleolithic elements are documented in the Transbaikal and Angara River basin regions, as well as in the upper Yenisei area (Chlachula, 2001). They have also been recorded in synchronous industries in Mongolia and North China, thus testifying to processes of cultural development similar to those in Siberia (Okladnikov, 1981; Derevianko, 1998). During the following interval, i.e., at the end of the Karginsk interstadial and in the early last glacial (Sartan) stage, more progressive Late Paleolithic cultures with the advanced technique of prismatic core flaking, reminiscent of the European traditions, emerged in vast areas of Siberia: on the eastern margin of the West Siberian Lowland, in the upper Yenisei River basin, as well as in the Angara River basin (Derevianko & Zenin, 1996; Vasiliev et al. 1999). Such industries, however, are absent in the Altai area, indicative of complex and regionally divergent cultural evolution in different parts of Siberia (Chlachula, 2001).

Unfortunately, the archaeological record alone, given the lack of human skeletal remains, is inconclusive about the anthropological traits that were characteristic of the Upper Paleolithic Siberian population. Yet Upper Paleolithic artifacts from 18000 B.P. have been found in association with skeletal remains whose morphology is similar to that of the teeth of contemporary anatomically modern humans from Europe (Turner II, 1987). However, in their cranial traits, Siberians from the Neolithic period are seen as more closely related to northern Chinese and Mongolians than to Europeans (Alexeev & Gohman, 1984; Alexeev, 1989). Similarly, sinodontic teeth appear to be characteristic of northern Chinese, Siberians, and Native Americans from 17000 B.P. onwards (Turner II, 1987).

Meanwhile, according to paleoanthropological data, Europeoid traits have been prevalent among the steppe-zone inhabitants of Tuva, Altai, Khakassia and West Mongolia since the Bronze Age or even earlier (Alexeev & Gohman, 1984; Alexeev, 1989). Subsequent migrations from Central and Eastern Asia led to the formation of the anthropological traits seen in the contemporary population of South Siberia. Thus, the considerable ethnic diversity in Southern Siberia was largely shaped by migration processes that have occurred since the initial colonization of the region in the Upper Paleolithic. Historically, a complex network of migrations can be traced from Central and Eastern Asia and from Western Eurasia (Alexeev, 1989; Vasiliev, 1993; Derevianko, 1998).

Although classical genetic data (Cavalli-Sforza et al. 1994; Rychkov et al. 2000) cover most of the present-day populations living in South Siberia, the high-resolution mtDNA and Y-chromosomal data sets for the populations living in this region are either incomplete or virtually absent. To investigate the origin and evolution of aboriginal populations of South Siberia, we performed a comprehensive mtDNA analysis (HVR1 sequencing combined with RFLP typing) of seven Altaic-speaking populations, occupying the broader area of Altai and Baikal regions, and compared them with the populations of Eastern and Central Asia that have had a great historical influence on Southern Siberians.

Material and Methods

Population Samples

A total of 480 hair root samples from unrelated individuals were collected from the following seven South Siberian autochthonous groups: Turkic-speaking Altaians, Tuvinians, Eastern Tuvinians (Todjins), Tofalars, Sojots, Khakassians and Mongolic-speaking Buryats (Figure 1). Information about birthplace, parents and grandparents was obtained from all donors.

Figure 1.

Geographic locations of the South Siberian populations studied. Populations are coded as: AL – Altaians, KH – Khakassians, BR – Buryats, ST – Sojots, TD – Todjins, TV – Tuvinians, TF – Tofalars.

The Altaian individuals (n = 110) came from five different districts of the Altai Republic. In detail, there were Telenghits from the Kosh-Agach and Ulagan districts, Altai-Kizhi from the Onguday, Ust-Koksa, Ust-Kan and Shebalinsk districts, Maimalars from the Shebalinsk and Maiminsk districts, and Tubalars and Chelkans from the Turochak district. The Altaians, the native people of the Altai Republic (South Siberia), number up to 70000 persons. ‘Altaians’ is the common denomination for seven formerly distinct Turkic-speaking groups - Altai-Kizhi, Teleuts and Telenghits, who represent Southern Altaians, and Chelkans, Kumandins, Tubalars and Maimalars, who represent Northern Altaians. The differences between southern and northern Altaians are well established, on the basis of anthropological, linguistic and classical genetic-marker studies (Potapov, 1969; Alexeev & Gohman, 1984; Luzina, 1987). The analysis of the tribal structure of Altaians has shown that the present-day Altaians have retained their native language and ethnic identity. They have begun to mix with other ethnic groups (mostly Russians and Kazakhs) only recently, so the interethnic admixture is estimated to be less than 5% (Luzina, 1987; Osipova et al. 1997).

Samples from the Tofalars (n = 58), a small, geographically isolated tribe of nomadic hunters and reindeer breeders occupying the taiga on the northern slopes of the East Sayan Mountains and numbering approximately 600 individuals, were collected in the village of Alygdzher in the Nizhneudinsk administrative district of the Irkutsk region. The Tofalars originally spoke a Samoyed language, but later changed to a Turkic-group language. They appear to be of mixed origin, but related to Eastern Tuvinians (Todjins) (Levin & Potapov, 1964).

The Tuvinian samples (n = 90) were collected in the Dzun-Khemchiksk, Mongun-Taiga, Bai-Taiga, Ovyursk, Tes-Khemsk, Erzinsk and Tandinsk districts of the Tuva Republic. Taking into account the ethnoterritorial differences existing among Tuvinians as well as the peculiar anthropological features observed in Eastern Tuvinians – Todjins (Alexeev & Gohman, 1984), the Todjin samples (n = 48) were collected in the villages of Toora-Khem, Iij-Khem, Ulug-oo and Adyr-Kezhig in the Todja district of the Tuva Republic. Most of the territory of the Tuva Republic is situated in the steppe zone in the centre of the Asian continent, bounded by the Sayan Mountains to the north and the Mongolian steppes to the south. Tuvinians number approximately 200,000 individuals, and they still pursue traditional subsistence activities similar to those of Buryats. Buryats, who are also nomadic cattle breeders, live in the south-central part of Siberia, bordering Mongolia and China (Levin & Potapov, 1964; Gurvich, 1980). They number up to 400,000 and represent the only Mongolic-speaking group in Siberia. Buryat samples (n = 91) were collected in villages of the Kizhinga, Khorinsk, Zakamensk, Eravna, Selenga, Barguzin and Kabansk districts of the Buryat Republic, thus encompassing all territories inhabited by the modern Buryats.

Khakassians, the northern neighbours of Tuvinians, live in the middle reaches of the River Yenisey and in the upper reaches of its tributaries, the Abakan and the Chulym. On an administrative level, they belong to the Khakass Republic in the Krasnoyarsk Region of the Russian Federation - an area of some 61,900 square kilometres. Northern and eastern parts of the region are flat steppelands (the Abakan-Minusinsk Basin), whereas southern and western regions are mountainous. The Khakassian samples (n = 53) were collected in settlements of the Askiz, Shirinsk, Beisk and Ordzhonikidzevsk districts of the Khakass Republic.

Additionally, the samples from the Sojots (n = 30) were collected in the Tunka and Okinsk districts of Buryat Republic. Historically, these samples represent well-defined Turkic-speaking tribes of cattle breeders, currently numbering approximately 2000 individuals.

DNA Extraction and Sequencing

DNA was extracted from the hair roots as described elsewhere (Walsh et al. 1991). PCR amplification of the entire noncoding region was performed using the primers L15926 and H00580. The temperature profile (for 30 cycles of amplification) was 94°C for 20 sec, 50°C for 30 sec, and 72°C for 2.5 min (Thermal Cycler 9700; Perkin Elmer, USA). The resulting amplification product was diluted 1000-fold and 4 µl aliquots were added to an array of second-round, nested PCR reactions (32 cycles) to generate DNA templates for sequencing. The primer sets L15997/M13(-21)H16401 and M13(-21)L15997/H16401 were used to generate both strands of the HVR1. Similarly, the primer sets L00029/M13(-21)H00408 and H00408/M13(-21)L00029 were used for HVR2. The nucleotide sequences of HVR1 from position 15999 to 16400 and of HVR2 from position 30 to 407 were determined. Primer sequences and nomenclature followed Sullivan et al. (1992). Negative controls were prepared for both the DNA extraction and the amplification process. PCR products were purified by ultrafiltration (Microcon 100; Amicon) and sequenced directly from both strands with a (-21)M13 primer using the BigDye Primer Cycle Sequencing Kit (Perkin Elmer) according to the manufacturer's protocol. Sequencing products were separated in a 4% PAGE gel on the ABI Prism™ 377 DNA Sequencer. Data were analyzed using the DNA Sequencing Analysis and Sequence Navigator programs (Perkin Elmer). The length polymorphisms located between positions 16180 and 16193 were excluded from the analyses.

RFLP Analysis

Several amplified segments, mainly in the mtDNA coding regions, were analyzed by RFLP testing according to the method described by Torroni et al. (1993, 1996) and Macaulay et al. (1999), to screen haplogroup-specific sites (Table 1). The restriction fragments were resolved by electrophoresis in 8% PAGE gels and were visualized after ethidium bromide staining under UV.

Table 1. RFLP polymorphisms used to identify mtDNA haplogroups

| Haplogroup | Characteristic restriction site(s) |
| --- | --- |
| M | +10394 DdeI, +10397 AluI |
| C | +10394 DdeI, +10397 AluI, −13259 HincII/+13262 AluI |
| D | +10394 DdeI, +10397 AluI, −5176 AluI |
| E | +10394 DdeI, +10397 AluI, −7598 HhaI |
| G | +10394 DdeI, +10397 AluI, +4830 HaeII/+4831 HhaI |
| A | +663 HaeIII |
| B | 9-bp deletion |
| F | −12406 HpaI/HincII |
| HV | −14766 MseI |
| H | −14766 MseI, −7025 AluI |
| U | +12308 HinfI |
| U2 | +15907 RsaI |
| K | +10394 DdeI, −9052 HaeII, +12308 HinfI |
| J | +10394 DdeI, −13704 BstOI |
| T* | +13366 BamHI |
| T1 | −12629 AvaII, +13366 BamHI |
| I | −4529 HaeII, +8249 AvaII, +10032 AluI, +10394 DdeI |
| W | +8249 AvaII, −8994 HaeIII |
| X | −1715 DdeI, +14465 AccI |
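
As an illustration of how the site combinations in Table 1 can be used for screening, the following minimal sketch matches a typed RFLP profile against the table. It is not the authors' procedure: the function name `candidate_haplogroups`, the dictionary layout, and the decision to represent linked sites (written with "/" in Table 1) by their first member only are all assumptions made for this example.

```python
# Table 1 as a lookup: haplogroup -> required marker states.
# True = site gain (+), False = site loss (-).
# Assumption: linked sites joined by "/" in Table 1 are represented
# by their first member only, and the 9-bp deletion is treated as a marker.
RFLP_MOTIFS = {
    "M": {"10394 DdeI": True, "10397 AluI": True},
    "C": {"10394 DdeI": True, "10397 AluI": True, "13259 HincII": False},
    "D": {"10394 DdeI": True, "10397 AluI": True, "5176 AluI": False},
    "E": {"10394 DdeI": True, "10397 AluI": True, "7598 HhaI": False},
    "G": {"10394 DdeI": True, "10397 AluI": True, "4830 HaeII": True},
    "A": {"663 HaeIII": True},
    "B": {"9-bp deletion": True},
    "F": {"12406 HpaI": False},
    "HV": {"14766 MseI": False},
    "H": {"14766 MseI": False, "7025 AluI": False},
    "U": {"12308 HinfI": True},
    "U2": {"15907 RsaI": True},
    "K": {"10394 DdeI": True, "9052 HaeII": False, "12308 HinfI": True},
    "J": {"10394 DdeI": True, "13704 BstOI": False},
    "T*": {"13366 BamHI": True},
    "T1": {"12629 AvaII": False, "13366 BamHI": True},
    "I": {"4529 HaeII": False, "8249 AvaII": True,
          "10032 AluI": True, "10394 DdeI": True},
    "W": {"8249 AvaII": True, "8994 HaeIII": False},
    "X": {"1715 DdeI": False, "14465 AccI": True},
}


def candidate_haplogroups(profile):
    """Return haplogroups whose required sites all match the typed profile.

    `profile` maps a site name to True (site present) or False (absent).
    Untyped sites never match. Results are ordered from the most specific
    motif (most markers) to the least specific.
    """
    matches = [
        hg for hg, motif in RFLP_MOTIFS.items()
        if all(profile.get(site) == state for site, state in motif.items())
    ]
    return sorted(matches, key=lambda hg: -len(RFLP_MOTIFS[hg]))


# A sample typed as +10394 DdeI, +10397 AluI, -5176 AluI.
print(candidate_haplogroups(
    {"10394 DdeI": True, "10397 AluI": True, "5176 AluI": False}
))
```

A profile carrying both +4830 HaeII and −7598 HhaI returns E and G as equally specific candidates, which is why RFLP screening alone cannot separate them and additional evidence such as the HVR1 motif is needed.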

Phylogenetic and Statistical Analyses

The phylogenetic relationships between mitochondrial haplotypes comprising various combinations of the HVR1 sequences and RFLPs were analyzed by the median-network method (Bandelt et al. 1995) and checked by the Network 3.1 program, from the Fluxus Engineering Web site. Coalescence times for mtDNA haplogroups were estimated according to the methods of Forster et al. (1996). Sequence classification into haplogroups was based on HVR1 and RFLP data. The nomenclatures of Macaulay et al. (1999), Richards et al. (2000) and Yao et al. (2002) were followed for the West Eurasian and East Asian mtDNA clusters.
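
The coalescence estimate can be illustrated with a small sketch of the ρ (rho) statistic, the average number of transitions separating sampled haplotypes from the root of a haplogroup. The sketch rests on several assumptions that are not stated in the text: a calibration of one transition per 20,180 years for np 16090–16365, distances counted directly against the root motif rather than along the branches of a reconstructed network, and toy haplotype counts chosen only for the example.

```python
# Assumed calibration: one transition per 20,180 years in np 16090-16365.
YEARS_PER_TRANSITION = 20180


def in_range(motif):
    """Keep only positions (minus 16000) inside the calibrated segment."""
    return {pos for pos in motif if 90 <= pos <= 365}


def rho_statistic(root, haplotypes):
    """Average number of differences from the root motif.

    `root` is a set of positions (minus 16000); `haplotypes` is a list of
    (motif set, number of individuals) pairs. Counting differences against
    the root equals the path length in the network only when there is no
    homoplasy, so this is a simplification.
    """
    root = in_range(root)
    total = sum(count for _, count in haplotypes)
    steps = sum(len(in_range(motif) ^ root) * count
                for motif, count in haplotypes)
    return steps / total


def coalescence_age(root, haplotypes):
    """Convert rho into years under the assumed calibration."""
    return rho_statistic(root, haplotypes) * YEARS_PER_TRANSITION


# Toy data (hypothetical counts) around the motif 16223-16298-16327.
root = {223, 298, 327}
sample = [
    ({223, 298, 327}, 5),        # root type
    ({129, 223, 298, 327}, 3),   # one step from the root
    ({223, 298, 311, 327}, 2),   # one step from the root
]
print(rho_statistic(root, sample), coalescence_age(root, sample))
```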

The basic parameters of molecular diversity and population genetic structure (including analyses of molecular variance, AMOVA) were calculated using the computer program Arlequin 2.0 (Schneider et al. 2000). The statistical significance of Fst-values was estimated by permutation analysis using 10000 permutations.
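
To make the diversity and permutation steps concrete, here is a minimal sketch written for illustration only. It does not reproduce Arlequin's AMOVA, which works on molecular distances between sequences; instead it uses Nei's unbiased gene diversity and a simple haplotype-frequency Fst, (H_T − H_S)/H_T, and tests significance by reshuffling individuals among populations. All function names are hypothetical.

```python
import random
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum of p_i squared)."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def heterozygosity(haplotypes):
    """Uncorrected 1 - sum of squared haplotype frequencies."""
    n = len(haplotypes)
    return 1 - sum((count / n) ** 2 for count in Counter(haplotypes).values())


def fst(populations):
    """Frequency-based Fst = (H_T - H_S) / H_T (a simplification)."""
    pooled = [hap for pop in populations for hap in pop]
    h_total = heterozygosity(pooled)
    if h_total == 0:
        return 0.0  # no variation at all, hence no differentiation
    h_within = sum(len(pop) / len(pooled) * heterozygosity(pop)
                   for pop in populations)
    return (h_total - h_within) / h_total


def permutation_test(populations, n_perm=10000, seed=0):
    """Return (observed Fst, share of permutations with Fst >= observed)."""
    rng = random.Random(seed)
    observed = fst(populations)
    pooled = [hap for pop in populations for hap in pop]
    sizes = [len(pop) for pop in populations]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if fst(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm


# Two toy populations with completely different haplotypes.
print(permutation_test([["a"] * 10, ["b"] * 10], n_perm=200, seed=1))
```

The permutation count defaults to 10000 to mirror the number used in the text; the toy call lowers it only to keep the example fast.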

Mitochondrial DNA HVR1 sequences from 103 Mongolians (Kolman et al. 1996), 66 Han Chinese and 64 South Koreans (Horai et al. 1996), 55 Kazakhs, 94 Kirghizs and 54 Uighurs (Comas et al. 1998) and 263 Han Chinese (Yao et al. 2002) were used for comparative analyses.

Results and Discussion

MtDNA Composition of the South Siberian Populations

Four hundred and eighty South Siberian samples were analyzed by HVR1 sequencing and assaying additional RFLP markers (Table 1). One hundred and eighty four different HVR1 haplotypes were observed. Altogether, 150 haplotypes belong to Asian-specific mtDNA haplogroups, M*, M7, M8, M9, M10, A, B, C, D, F, G, Y, Z, N9a, R*, whereas 34 belong to West Eurasian-specific haplogroups, H, J, T, U, X, I, N1a (Table 2). The haplogroup frequencies observed in South Siberia are summarized in Table 3. The populations studied exhibit a high percentage of haplogroup M mtDNA lineages (M*, M7, M8, M9, M10, C, D, G, Z), ranging from 48.2% in Altaians to 80.2% in Buryats. Haplogroup C is the most frequent haplogroup within the Asian-specific fraction, closely followed by haplogroup D. Together, haplogroups C and D account for about one third to over half of the Asian-specific fraction in all populations studied. The spread of haplogroups C and D reveals some characteristic differences among populations of South Siberia. For example, haplogroup C variants are predominantly found among populations living in the East Sayan region: Tofalars, Tuvinians and Todjins. On the other hand, the highest frequency of haplogroup D was observed among Buryats and Sojots (33% and 46.7%, respectively) living in the Baikal region.

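Table 2. HVR1 sequence variation and mtDNA haplogroup (HG) status of 480 South Siberian samples

Note: Variant positions from the Cambridge Reference Sequence (CRS) of Anderson et al. (1981) are shown minus 16000. Transversions are further specified by the appropriate base change. Ins indicates an insertion. Heteroplasmic variants are shown by a slash (/). Populations are coded as: AL – Altaians, KH – Khakassians, BR – Buryats, ST – Sojots, TD – Todjins, TV – Tuvinians, TF – Tofalars. Sample sizes are given in parentheses. (a) This haplotype has already been published (Derenko et al. 2001).

| HG | HVR1 sequence | AL (110) | KH (53) | BR (91) | ST (30) | TD (48) | TV (90) | TF (58) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | 086 223 290 319 362 |  |  |  |  | 1 |  |  |
| A | 183 223 274 290 319 362 |  |  |  |  |  | 1 |  |
| A | 189 223 290 319 362 |  |  | 1 |  |  |  |  |
| A | 223 242 290 293C 319 |  |  |  |  |  |  | 3 |
| A | 223 290 292A 319 362 |  |  | 1 |  |  |  |  |
| A | 223 290 297 311 319 362 |  |  |  | 3 |  |  |  |
| A | 223 290 319 362 |  | 2 |  |  | 1 |  |  |
| B4 | 189 217 |  |  | 2 | 1 |  |  |  |
| B4 | 189 217 240 |  |  | 1 |  |  |  |  |
| B4 | 093 145 189 217 266 362 |  |  |  |  | 1 | 4 |  |
| B4 | 129 153 189 217 223 247 320 |  |  | 1 |  |  |  |  |
| B4a | 129 189 217 261 356 |  |  |  |  |  | 1 |  |
| B4a | 167 189 217 261 317T |  |  |  |  | 1 |  | 2 |
| B4a | 189 217 261 299 |  |  |  |  |  | 1 |  |
| B4b | 086 136 189 217 | 2 | 2 |  |  |  | 1 |  |
| B4b | 086 136 189 217 293A/G | 1 |  |  |  |  |  |  |
| B5b | 111 140 189 234 243 304 |  |  | 1 |  |  |  |  |
| B5b | 140 189 243 274 | 1 |  | 1 |  |  |  |  |
| C | 025 093 129 223 235 298 327 390 |  | 1 |  |  |  |  |  |
| C | 093 129 223 235 298 327 390 |  |  |  |  |  |  | 19 |
| C | 129 223 235 298 327 390 |  | 1 |  |  |  |  |  |
| C | 223 235 298 327 390 |  |  |  |  |  |  | 1 |
| C | 093 129 150 189 223 298 327 |  |  |  |  |  | 1 |  |
| C | 093 129 223 298 327 | 6 |  | 3 |  |  | 6 |  |
| C | 093 129 223 298 327 381 |  |  |  |  |  | 1 |  |
| C | 093 129 223 327 |  |  | 1 |  |  |  |  |
| C | 129 150 178 223 298 327 |  | 2 |  |  |  |  |  |
| C | 129 150 223 298 327 |  |  | 1 |  |  |  |  |
| C | 129 140 171 223 291 298 327 344 357 |  | 1 |  |  |  |  |  |
| C | 129 140 171 223 298 327 344 357 |  | 2 |  |  |  |  |  |
| C | 167 171 223 298 327 344 357 | 1 |  |  |  |  |  |  |
| C | 171 223 224 298 327 344 357 |  |  |  |  |  |  | 1 |
| C | 171 223 298 327 344 |  |  |  |  |  | 1 |  |
| C | 171 223 298 327 344 357 | 3 | 2 |  |  | 3 | 2 | 3 |
| C | 223 298 327 344 357 |  |  |  |  |  |  | 1 |
| C | 093 223 288 298 327 390 |  |  |  |  | 2 |  |  |
| C | 148 164 223 288 298 327 |  | 1 |  |  |  |  |  |
| C | 148 223 288 298 301 327 |  |  |  |  |  | 1 |  |
| C | 148 223 288 298 327 |  |  | 1 | 1 | 4 | 5 | 5 |
| C | 148 223 327 |  |  |  |  | 1 |  |  |
| C | 093 223 288 291 298 327 |  | 1 |  |  |  |  |  |
| C | 093 223 261 288 298 |  |  | 2 |  |  |  |  |
| C | 223 261 288 298 |  |  | 2 |  | 1 | 1 |  |
| C | 223 298 327 | 5 | 4 | 2 | 1 | 4 | 10 | 5 |
| C | 223 298 311 327 |  |  | 2 |  | 2 | 5 |  |
| C | 189 223 294 298 311 327 |  |  |  |  | 1 |  |  |
| C | 223 294 298 311 327 |  |  |  |  | 2 | 6 |  |
| C | 223 259insA 294 298 327 |  |  | 1 |  |  |  |  |
| C | 223 259insA 298 327 |  |  | 3 | 4 |  |  |  |
| C | 223 270 298 327 |  |  | 3 |  |  |  |  |
| C | 223 291 298 327 | 4 |  |  |  | 2 | 2 |  |
| C | 223 293 298 327 |  |  | 1 |  | 1 |  | 1 |
| C | 129 223 298 327 |  | 3 | 3 |  |  | 2 |  |
| C | 025 223 298 327 |  | 1 |  |  |  |  |  |
| C | 175 223 298 327 |  |  | 1 |  |  |  |  |
| C | 223 298 327 343 | 1 |  |  |  |  |  |  |
| C | 223 242 298 327 | 1 |  |  |  |  |  |  |
| D | 042 172 223 362 |  |  | 1 |  |  |  |  |
| D5a | 051 172 189 223 266 362 |  |  | 1 |  |  |  |  |
| D5a | 092 126 164 189 223 266 362 | 1 | 1 |  |  |  |  |  |
| D5a | 092 164 172 189 223 266 362 |  |  |  |  |  | 2 |  |
| D5b | 126 136 189 223 360 | 1 |  |  |  |  |  |  |
| D | 082 147A 189C/T 223 362 | 1 |  |  |  |  |  |  |
| D | 082 147A 223 362 | 1 |  |  |  |  |  |  |
| D | 082 223 362 |  | 3 | 1 |  |  |  |  |
| D | 092 129 148 223 271 362 |  |  | 1 |  |  |  |  |
| D | 092 223 316 362 | 1 |  |  |  |  |  |  |
| D | 093 164 189 223 228 362 |  |  | 1 |  |  |  |  |
| D | 093 164 223 245 362 |  |  | 1 |  |  |  |  |
| D | 093 172 173 215 223 319 362 | 2 |  |  |  |  |  |  |
| D | 093 223 239 243 319 362 |  |  |  |  |  | 1 |  |
| D | 129 145 223 311 319 362 |  |  | 2 |  |  |  |  |
| D | 129 173 223 319 362 |  |  | 1 |  |  |  |  |
| D | 093 223 232 290 362 |  |  | 1 |  |  |  |  |
| D | 093 223 362 |  |  |  |  |  | 1 |  |
| D | 129 152 179 192 223 362 |  |  |  |  |  | 2 |  |
| D | 147 223 362 |  |  | 1 |  |  |  |  |
| D | 140 223 274 311 362 | 2 |  |  |  |  |  |  |
| D | 171 223 311 362 |  |  | 1 |  |  |  |  |
| D | 171T 223 355A 362 |  |  | 1 | 4 |  |  |  |
| D | 174 223 262 |  | 1 | 1 | 4 |  |  |  |
| D | 184C/T 223 311 362 |  |  | 1 |  |  |  |  |
| D | 192 223 362 |  |  | 1 |  |  |  |  |
| D | 218 223 362 |  |  | 1 |  |  |  |  |
| D | 221 223 245 362 |  |  | 1 |  |  |  |  |
| D | 223 232 290 362 |  |  | 1 |  |  |  |  |
| D | 223 245 362 |  |  | 1 |  |  |  |  |
| D | 223 291 362 | 1 |  |  |  |  |  |  |
| D | 223 294 362 |  |  | 1 |  |  |  |  |
| D | 223 311 362 |  |  | 3 |  |  |  |  |
| D | 223 319 362 | 5 |  | 1 | 1 |  | 8 |  |
| D | 182 223 362 |  |  | 1 |  |  |  |  |
| D | 223 274 362 |  |  | 2 |  |  |  |  |
| D | 223 362 | 2 | 2 | 2 | 5 | 2 | 2 |  |
| G2 | 223 278 362 |  |  | 5 |  |  |  |  |
| G2 | 051 150 223 278 362 |  |  | 1 |  |  |  |  |
| G2 | 129 223 274 278 362 |  |  |  | 1 |  |  |  |
| G2 | 145 223 278 362 |  |  | 1 |  |  |  |  |
| G2 | 223 278 287 304 362 | 1 |  |  |  |  |  |  |
| G2a | 003 105 107G 223 227 278 362 |  |  |  |  |  |  | 1 |
| G2a | 189C/T 223 227 278 362 |  |  | 1 |  |  |  |  |
| G2a | 223 227 234 278 362 |  |  | 1 |  |  |  |  |
| G2a | 223 227 274 278 362 |  |  | 2 | 1 |  | 2 |  |
| G2a | 223 227 278 362 |  |  | 2 |  |  |  |  |
| G3 | 093 223 274 362 390 |  |  |  |  |  | 1 |  |
| G3 | 156 223 274 362 390 |  |  |  |  |  | 1 |  |
| G4 | 223 325 362 |  |  |  |  | 3 |  |  |
| G4 | 129 223 325 362 365 | 1 |  |  |  |  |  |  |
| G4 | 218 223 260 325 362 |  |  |  |  | 4 | 1 |  |
| G4 | 223 260 325 362 |  |  |  |  | 2 |  |  |
| G4 | 223 260C/T 325 362 |  |  |  |  |  | 1 |  |
| F1 | 093 207 304 362 399 | 1 |  |  |  |  |  |  |
| F1a | 129 162 172 304 399 |  | 1 |  |  |  |  |  |
| F1a | 162 172 304 |  | 2 |  |  |  |  |  |
| F1b | 189 304 | 5 | 4 |  |  |  |  |  |
| F1b | 114A 189 232A 249 304 311 |  |  | 1 |  |  |  |  |
| F1b | 129 189 232A 249 304 311 344 |  | 1 |  |  |  |  |  |
| F1b | 172 179 189 232A 249 304 311 | 3 |  |  |  | 1 | 2 |  |
| F1b | 179 189 232A 249 304 311 |  | 1 |  |  |  |  |  |
| F1b | 189 232A 249 304 311 |  | 3 |  |  |  |  |  |
| F2a | 092A 291 304 | 1 |  |  |  |  |  |  |
| R* | CRS |  |  |  |  |  |  | 1 |
| R* | 124 148 290 304 309 390 |  |  |  |  |  | 1 |  |
| R* | 145 192 243 304 309 362 390 |  |  |  | 1 |  |  |  |
| R* | 145 192 243 304 309 390 |  |  |  |  | 1 | 1 |  |
| R* | 051 168 172 311 |  |  |  |  | 1 |  |  |
| H | 304 |  |  |  |  | 1 |  |  |
| H | 092 245 362 | 1 |  |  |  |  |  |  |
| H | 169 184 | 2 |  |  |  |  |  |  |
| H | 288 362 | 1 | 2 |  |  |  |  |  |
| H | 093 129 168 291 |  |  |  |  |  |  | 1 |
| H | CRS |  |  | 2 |  |  |  | 3 |
| H | 220C 235 291 |  |  |  |  |  | 1 |  |
| I | 129 223 391 | 2 |  | 1 |  |  |  |  |
| J* | 069 126 |  |  | 2 |  |  |  | 5 |
| J1 | 069 126 145 172 222 261 |  | 1 |  |  |  | 4 |  |
| J1 | 069 126 145 261 290 | 4 |  |  |  |  |  |  |
| J1 | 069 126 145 172 261 278 |  |  |  |  |  | 1 |  |
| M* | 145 148 188 189 223 381 |  |  | 1 |  |  |  |  |
| M7b1 | 129 192 223 297 |  |  |  |  | 1 |  |  |
| M7c | 223 248 295 319 |  |  | 1 |  |  |  |  |
| M7c | 145 223 295 304 |  |  |  |  | 1 |  |  |
| M8 | 184 189 223 298 355 362 | 2 |  |  |  |  |  |  |
| M8a | 134 184 223 287 298 319 | 1 |  |  |  |  |  |  |
| M8a | 148 223 298 319 | 1 |  |  |  |  |  |  |
| M8a | 184 223 298 319 | 1 |  |  |  |  |  |  |
| M9 | 223 234 291 316 362 | 1 |  |  |  |  |  |  |
| M9 | 223 234 316 362 | 1 |  |  |  |  |  |  |
| M10 | 129 193 223 311 357 |  |  | 1 |  |  |  |  |
| M10 | 093 193C/T 223 311 357 381 | 1 |  |  |  |  |  |  |
| N1a | 147A 172 189 223 248 320 355 | 2 |  |  |  |  |  |  |
| N1a | 147G 172 189 223 248 320 355 | 1 |  |  |  |  |  |  |
| N9a | 111 129 223 257A 261 |  |  |  |  |  | 1 |  |
| N9a | 172 223 257A 261 |  | 1 |  |  |  |  |  |
| N9a | 189 223 257A 261 | 2 |  |  |  |  |  |  |
| N9a | 223 248 257A 261 311 | 3 |  |  |  |  |  |  |
| T* | 051 126 189 294 296 |  |  |  |  |  | 1 |  |
| T* | 126 168 294 296 324 |  | 1 |  |  |  |  |  |
| T* | 126 294 296 |  |  | 1 |  |  |  |  |
| T* | 126 294 304 |  |  |  |  |  |  | 3 |
| T1 | 126 163 186 189 294 | 1 |  |  |  |  |  |  |
| U2 | 051 129C 189 214 258 362 | 5 |  |  |  |  |  |  |
| U2 | 051 129C 189 294 362 |  | 1 |  |  |  |  |  |
| U2 | 189 214 362 | 1 |  |  |  |  |  |  |
| U3 | 343 | 2 |  |  |  | 3 | 1 |  |
| U4 | 311 356 | 2 | 4 |  |  |  |  |  |
| U5a | 172 192 256 270 291 311 399 |  |  |  | 1 |  |  |  |
| U5a | 192 241 256 270 287 304 325 399 | 1 |  |  |  |  |  |  |
| U5a | 192 256 270 |  |  | 1 |  |  |  |  |
| U5b | 189 261 270 |  |  |  |  |  | 1 |  |
| U5b | 189 270 |  |  |  |  |  | 1 |  |
| U5b | 192 249 311 | 3 |  |  |  |  |  |  |
| K | 224 311 |  |  |  | 1 |  |  |  |
| X | 189 223 278 (a) | 3 |  |  |  |  |  |  |
| Y | 126 189 231 266 311 |  |  |  |  | 1 |  |  |
| Y | 126 193 231 266 |  |  |  |  |  | 1 |  |
| Y | 126 231 266 |  |  | 1 |  |  |  |  |
| Y | 126 231 266 319 399 |  |  | 1 | 1 |  |  |  |
| Z | 129 185 223 224 260 298 |  |  | 1 |  |  |  |  |
| Z | 185 223 260 298 | 5 |  |  |  |  |  |  |
| Z | 185 223 260 298 360 |  |  |  |  |  |  | 3 |
| Z | 185 223 260 298 399 |  |  |  |  |  | 1 |  |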
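
The motif notation used in Table 2 can be expanded into explicit variants with a short parser. The sketch below is only an illustration of the notation described in the table note (positions minus 16000, a base suffix for transversions, `ins` for insertions, a slash for heteroplasmy); the function name `parse_motif` and the output layout are assumptions made for this example.

```python
import re

# One token: three digits, optionally followed by an insertion (insA),
# a heteroplasmic state (C/T) or a single base marking a transversion.
TOKEN = re.compile(r"^(\d{3})(ins[ACGT]|[ACGT]/[ACGT]|[ACGT])?$")


def parse_motif(motif):
    """Turn a Table 2 motif string into (position, kind, detail) tuples.

    Positions are converted back to full numbering by adding 16000.
    A motif of "CRS" means no difference from the reference sequence.
    """
    if motif.strip() == "CRS":
        return []
    variants = []
    for token in motif.split():
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        position = 16000 + int(match.group(1))
        suffix = match.group(2)
        if suffix is None:
            kind = "transition"
        elif suffix.startswith("ins"):
            kind = "insertion"
        elif "/" in suffix:
            kind = "heteroplasmy"
        else:
            kind = "transversion"
        variants.append((position, kind, suffix))
    return variants


for variant in parse_motif("082 147A 189C/T 223 362"):
    print(variant)
```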
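
Table 3. MtDNA haplogroup distribution (no. of individuals and % values in parentheses) in South Siberian populations

| Haplogroup | Altaians (110) | Khakassians (53) | Buryats (91) | Sojots (30) | Todjins (48) | Tuvinians (90) | Tofalars (58) | In total (480) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | 0 | 2 (3.8) | 2 (2.2) | 3 (10.0) | 2 (4.2) | 1 (1.1) | 3 (5.2) | 13 (2.7) |
| B | 4 (3.6) | 2 (3.8) | 6 (6.6) | 1 (3.3) | 2 (4.2) | 7 (7.8) | 2 (3.5) | 24 (5.0) |
| M* | 8 (7.3) | 0 | 3 (3.3) | 0 | 2 (4.2) | 0 | 0 | 13 (2.7) |
| C | 21 (19.1) | 19 (35.9) | 26 (28.6) | 6 (20.0) | 23 (47.9) | 43 (47.8) | 36 (62.1) | 174 (36.3) |
| D | 17 (15.5) | 7 (13.2) | 30 (33.0) | 14 (46.7) | 2 (4.2) | 16 (17.8) | 0 | 86 (17.9) |
| G2 | 1 (0.9) | 0 | 13 (14.3) | 2 (6.7) | 0 | 2 (2.2) | 1 (1.7) | 19 (4.0) |
| G | 1 (0.9) | 0 | 0 | 0 | 9 (18.8) | 4 (4.4) | 0 | 14 (2.9) |
| Z | 5 (4.6) | 0 | 1 (1.1) | 0 | 0 | 1 (1.1) | 3 (5.2) | 10 (2.1) |
| F | 10 (9.1) | 12 (22.6) | 1 (1.1) | 0 | 1 (2.1) | 2 (2.2) | 0 | 26 (5.4) |
| N9a | 5 (4.5) | 1 (1.9) | 0 | 0 | 0 | 1 (1.1) | 0 | 7 (1.5) |
| Y | 0 | 0 | 2 (2.2) | 1 (3.3) | 1 (2.1) | 1 (1.1) | 0 | 5 (1.0) |
| H | 7 (6.4) | 2 (3.8) | 2 (2.2) | 0 | 1 (2.1) | 1 (1.1) | 4 (6.9) | 17 (3.5) |
| U | 18 (16.4) | 6 (11.3) | 1 (1.1) | 1 (3.3) | 3 (6.3) | 3 (3.3) | 0 | 32 (6.7) |
| K | 0 | 0 | 0 | 1 (3.3) | 0 | 0 | 0 | 1 (0.2) |
| T | 1 (0.9) | 1 (1.9) | 1 (1.1) | 0 | 0 | 1 (1.1) | 3 (5.2) | 7 (1.5) |
| J | 4 (3.6) | 1 (1.9) | 2 (2.2) | 0 | 0 | 5 (5.6) | 5 (8.6) | 17 (3.5) |
| R* | 0 | 0 | 0 | 1 (3.3) | 2 (4.2) | 2 (2.2) | 1 (1.7) | 6 (1.3) |
| X | 3 (2.7) | 0 | 0 | 0 | 0 | 0 | 0 | 3 (0.6) |
| N1a | 3 (2.7) | 0 | 0 | 0 | 0 | 0 | 0 | 3 (0.6) |
| I | 2 (1.8) | 0 | 1 (1.1) | 0 | 0 | 0 | 0 | 3 (0.6) |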

The geographic distribution of haplogroup Z contrasts with that of its sister haplogroup C (Table 3). Among Khakassians, Sojots and Todjins haplogroup Z has not been found, as opposed to the high frequency of haplogroup C among them. Haplogroup Z is found in Altaians, Buryats, Tuvinians and Tofalars with an overall frequency of 2.1%.

Haplogroup G mtDNAs, which are widely distributed in the northeast Siberian populations of Koryaks, Evens, Chukchi and Itelmens (Derenko & Shields, 1997; Starikovskaya et al. 1998; Schurr et al. 1999), were practically absent in the majority of South Siberian populations, with the exception of Todjins where the frequency is rather high - 18.8%. In contrast, mtDNAs harbouring both G and E specific RFLPs (+4830 HaeII/+4831 HhaI for G and −7598 HhaI for E) were identified in five out of seven populations studied - Altaians, Buryats, Sojots, Tuvinians and Tofalars, with frequencies varying from 0.9% in Altaians to 14.3% in Buryats. Such ‘E/G’ mtDNA variants originated on the background of haplogroup G due to mutation at np 7600, which gives a similar E-specific RFLP pattern, and therefore should be considered as subgroup G2 within haplogroup G (Yao et al. 2002). It is noteworthy that four Buryat and three Tuvinian mtDNAs which we classified previously as E haplotypes (Derenko et al. 2000), also harbour +4830 HaeII/+4831 HhaI site gains characteristic for haplogroup G. Thus, it seems that haplogroup E has a very restricted distribution: it is virtually absent in South Siberia and occurs as rarely as 1.5–5% in Tibet and southern China (Torroni et al. 1994; Kivisild et al. 2001).

Haplogroup M sub-lineages M7, M8, M9, M10 and M* were detected in Todjins, Buryats and Altaians with frequencies of 4.2%, 3.3% and 7.3%, respectively. Haplogroups B and F, encompassing almost all East Asian R lineages (Richards & Macaulay, 2000; Yao et al. 2002), are found in South Siberian populations with considerable differences in geographic distribution. Haplogroup B mtDNAs are widely spread, although at low frequencies (ranging from 3.3% to 7.8%), among all South Siberian populations studied here. Haplogroup F has not been found in Sojots and Tofalars; it is also very rare in Buryats, Tuvinians and Todjins, whereas in Khakassians and Altaians it is found with frequencies of 22.6% and 9.1%, respectively.

Haplogroup Y, widely spread in Northeastern Asia where it is found with considerable frequencies in Evens, Koryaks, Itelmens, Nivkhs and Ainu (Horai et al. 1996; Derenko & Shields, 1997; Schurr et al. 1999), is much less frequent in South Siberia, being found in Buryats, Sojots, Tuvinians and Todjins. Haplogroup N9a, which has a predominantly East Eurasian distribution, was also found at very low frequencies among Altaians, Khakassians and Tuvinians.

Despite the fact that the majority of maternal lineages of South Siberian populations belong to East Asian specific mtDNA haplogroups, a substantial West Eurasian fraction was revealed in gene pools of the populations studied (Table 3). Lineages characteristic of West Eurasian populations were found with the highest frequency among Altaians (34.5%), Khakassians (18.9%) and Tofalars (20.7%), but are less frequent among Tuvinians, Todjins, Sojots and Buryats. Haplogroup U is the most frequent haplogroup within the West Eurasian fraction, closely followed by haplogroups H and J. A high percentage of haplogroups U and H was observed in the mtDNA pool of Altaians (16.4% and 6.4%, respectively) and Khakassians (11.3% and 3.8%, respectively), whereas the highest frequencies of haplogroups J and T (8.6% and 5.2%, respectively) were detected in Tofalars. Altaians also possess mtDNA sequences belonging to some rare West Eurasian haplogroups such as N1a, X, and I.
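
The West Eurasian shares quoted above follow directly from the counts in Table 3. The short sketch below recomputes them; it assumes that the West Eurasian fraction consists of haplogroups H, U, K, T, J, X, N1a and I, with R* left out, which is the grouping that reproduces the percentages given in the text.

```python
POPULATIONS = ["Altaians", "Khakassians", "Buryats", "Sojots",
               "Todjins", "Tuvinians", "Tofalars"]
SAMPLE_SIZES = [110, 53, 91, 30, 48, 90, 58]

# Counts copied from Table 3, in the population order given above.
WEST_EURASIAN_COUNTS = {
    "H":   [7, 2, 2, 0, 1, 1, 4],
    "U":   [18, 6, 1, 1, 3, 3, 0],
    "K":   [0, 0, 0, 1, 0, 0, 0],
    "T":   [1, 1, 1, 0, 0, 1, 3],
    "J":   [4, 1, 2, 0, 0, 5, 5],
    "X":   [3, 0, 0, 0, 0, 0, 0],
    "N1a": [3, 0, 0, 0, 0, 0, 0],
    "I":   [2, 0, 1, 0, 0, 0, 0],
}


def west_eurasian_percent():
    """Percentage of West Eurasian mtDNAs per population, to one decimal."""
    result = {}
    for i, (population, n) in enumerate(zip(POPULATIONS, SAMPLE_SIZES)):
        carriers = sum(counts[i] for counts in WEST_EURASIAN_COUNTS.values())
        result[population] = round(100 * carriers / n, 1)
    return result


for population, share in west_eurasian_percent().items():
    print(f"{population}: {share}%")
```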

Thus, the mtDNA haplogroup distribution data indicate that contemporary South Siberian maternal lineages evolved largely on the basis of Asian-specific substratum, with the West Eurasian component accounting for 7%–35% of mtDNA haplotypes.

Phylogenetic Analysis of South Siberian mtDNA Lineages

To provide further insight into the variation of mtDNA haplogroups revealed in South Siberian populations, a detailed phylogenetic analysis of HVR1 sequences was performed. Moreover, in order to obtain some additional information on mtDNA classification, we determined HVR2 sequences in Altaians (n = 110) characterized by the highest level of mtDNA variability among populations studied here.

Figure 2 shows the median network of haplogroups C, Z and M8, which are defined by a transition at nucleotide position (np) 16298. According to the phylogenetic data based on HVR1 and HVR2 variation in Altaians, both haplogroup C and Z sequences are characterized by the deletion of an adenine residue at np 249 (Figure 3). In addition, based on whole mitochondrial genome sequencing data, these haplogroups share polymorphisms at nps 4715, 7196CA, 8584 (Finnilä et al. 2001; Maca-Meyer et al. 2001) and therefore should be considered as sister haplogroups (Yao et al. 2002).

Figure 2.

Phylogenetic network of C, Z and M8 HVR1 sequences revealed in South Siberian populations. Populations are coded as: AL – Altaians, KH – Khakassians, BR – Buryats, ST – Sojots, TD – Todjins, TV – Tuvinians, TF – Tofalars. Circle size is proportional to the haplotype frequency in populations, number of individuals is indicated inside. Links are labelled by the nucleotide positions in HVR1 (minus 16000) to designate transitions; transversions are further specified. HVR1 mutations and RFLP variants are shown indicating nucleotide positions relative to the CRS (Anderson et al. 1981). The arrow points to the portion of the network characterized by the site gain. Insertion is designated as ins.

Figure 3.

Phylogenetic network of Altaian HVR1 and HVR2 sequences. Circle size is proportional to the haplotype frequency in the population. Links are labelled by the nucleotide positions in HVR1 and HVR2 to designate transitions; transversions are further specified. HVR1 and HVR2 mutations and RFLP variants are shown indicating nucleotide positions relative to the CRS (Anderson et al. 1981). The arrow points to the portion of the network characterized by the site gain. Insertions and deletions are designated as ins and del, respectively. Heteroplasmic variants are marked with an ‘h’.

In haplogroup C, the most frequent haplotype, represented by HVR1 motif 16223-16298-16327, is observed in all populations studied. This haplotype appears to be the ancestral type from which several one-step-related sequences are derived. Besides that, at least three distinct clusters were observed, the first being determined by a transition at np 16129, the second by a transition at np 16288, and the third by the HVR1 motif 16171-16344-16357. The first cluster appears to be region-specific rather than population-specific, since it was detected in the majority of South Siberian populations. Interestingly, this cluster contains the Tofalar-specific HVR1 haplotype 16093-16129-16223-16235-16298-16327-16390, which occupies the external position. The high frequency of this mtDNA in Tofalars is likely due to a founder effect. The second cluster is also distributed widely in South Siberian populations, but is not found in Altaians. By contrast, the third C haplogroup cluster includes sequences from all populations studied, except for Buryats and their closest neighbours, Sojots. Meanwhile, HVR1 sequences differing from the C-root sequence by an insertion at np 16259 were found to be specific only for Buryats and Sojots. These mtDNA haplotypes have not been found in other Siberian populations, but one of them was observed among Mongolians (Kolman et al. 1996). The coalescence time of all haplotypes to the root of haplogroup C was estimated as 38400 ± 9900 years B.P., suggesting an expansion of this haplogroup before the Last Glacial Maximum.
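
The three haplogroup C clusters described above can be expressed as a simple motif test. This is a deliberately strict sketch with hypothetical names: a haplotype is assigned to a cluster only if it carries every defining position, so sequences that have lost one of the defining variants through back mutation, which a network analysis would still place in the cluster, are not captured.

```python
# Defining positions (minus 16000) for the three clusters named in the text.
C_CLUSTERS = {
    "cluster 1 (16129)": {129},
    "cluster 2 (16288)": {288},
    "cluster 3 (16171-16344-16357)": {171, 344, 357},
}


def c_clusters(motif):
    """Return the haplogroup C clusters whose defining positions all occur.

    `motif` is an HVR1 string in the Table 2 notation; only the three-digit
    position at the start of each token is used.
    """
    positions = {int(token[:3]) for token in motif.split()}
    return [name for name, required in C_CLUSTERS.items()
            if required <= positions]


# The Tofalar-specific haplotype falls in the first cluster.
print(c_clusters("093 129 223 235 298 327 390"))
# The root motif belongs to none of the clusters.
print(c_clusters("223 298 327"))
```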

In contrast to haplogroup C, mtDNAs from haplogroup Z are not frequent in South Siberian populations. Only four distinct haplotypes were found in the populations studied. Among them, only the Altaian sequence 16185-16223-16260-16298 matches the possible root of the Z phylogeny, whereas its derivative 16129-16185-16223-16224-16260-16298, found in Buryats, appears to be the most frequent Z-sequence distributed in North European (Saami [Delghandi et al. 1998]) and Northeast Siberian populations (Koryaks and Evens [Derenko & Shields, 1997]). In addition, several M8 mtDNA sister haplotypes to the CZ-haplogroup were found in Altaians. The majority of these sequences are determined by a transition at np 16319 and form a distinct cluster, M8a (Figure 2).

Figure 4 shows haplogroups D, G, M7, M9, M10 and M*. Haplogroup D, comprising 17.9% of the combined data set, showed the greatest diversity of HVR1 sequences relative to other haplogroups. The network of haplogroup D mtDNAs was highly starlike, with a root sequence distributed widely throughout Siberia. By contrast, the majority of its derivatives are found predominantly in Buryats. Additionally, at least one cluster, defined by a transition at np 16319, could be identified. The members of this cluster, including the ancestral sequence, occur mainly in Altaians, Tuvinians, and Buryats. A small, additional cluster D5, characterized by a transition at 16189 in conjunction with the lack of M-specific RFLPs (caused by a transition at 10397; Bandelt et al. 1999), was identified in the same populations. According to published data, it seems that this cluster has a very restricted distribution: it is most frequent in China but rare or absent in Central Asia and Siberia (Kolman et al. 1996; Yao et al. 2002). Cluster D5 encompasses two HVR1 motifs, 16092-16266 (D5a) and 16126-16136-16360 (D5b). The latter is found in one Altaian, and the same HVR1 sequence was previously observed in the Saami with a frequency of 4.7% (Delghandi et al. 1998).

Figure 4.

Phylogenetic network of South Siberian D, G and M7, M9, M10 and M* HVR1 sequences. Designations as shown in Figure 2.

The estimated coalescence age of haplogroup D in South Siberia is 37500±6700 years B.P., suggesting that this haplogroup evolved and expanded well before the Last Glacial Maximum.

Haplogroup G, comprising nearly 7% of the total data, exhibits at least three distinct clusters in South Siberian populations. The first cluster, defined by a transition at np 16325 (G4), encompassed five different HVR1 sequences found predominantly in Tuvinians and Todjins. The second cluster, determined by transitions at nps 16274 and 16390 (G3), is linked with two different HVR1 sequences also observed only in Tuvinians. By contrast, the largest cluster within haplogroup G, G2, determined by a transition at np 16278 and HhaI site loss at np 7598 (characteristic for haplogroup E), was present in most South Siberian populations studied. Cluster G2 may be fairly ancient, since its coalescence age is estimated as 27600 ± 12400 years B.P. In addition to its considerable age, cluster G2 has a striking phylogeographic distribution, restricted to Central and East Asia, being found in Central Asians, Han Chinese, Tibetans and Ainu (Torroni et al. 1994; Horai et al. 1996; Comas et al. 1998; Yao et al. 2002).

It should be noted that the additional haplogroup G cluster G1 was previously described in Northeast Siberian populations of Koryaks, Evens, Chukchi and Itelmens (Derenko & Shields, 1997; Starikovskaya et al. 1998; Schurr et al. 1999). Cluster G1 is characterized by a transition at np 16017, which clearly distinguishes its haplotypes from other haplogroup G mtDNA sequences described here. The presence of at least four distinct clusters within haplogroup G, as well as its obvious geographical substructuring, implies a considerable degree of divergence of these mtDNAs in Siberia.

The remaining M mtDNAs are represented in South Siberia by several minor branches occurring at very low frequencies among Altaians, Buryats and Todjins (Figure 4). The only Todjin HVR1 sequence 16129-16192-16223-16297 probably belongs to haplogroup M7b1, whereas the mtDNA haplotypes defined by 16234-16316 and 16311-16357 HVR1 motifs could be assigned to haplogroups M9 and M10, respectively (Yao et al. 2002). Two other HVR1 sequences with the 16295 motif were classified as M7c mtDNAs according to the East Asian mtDNA classification (Yao et al. 2002). The Buryat mtDNA with the 16145-16148-16188-16189-16223-16381 HVR1 sequence belongs to the still unassigned M* haplotypes.

Figure 5 shows mtDNA haplotypes classified into various haplogroups of macrohaplogroups N and R. The latter is considered as a subhaplogroup of N shared between Eastern Asians and Western Eurasians. Contrary to the widely spread M haplogroups, East Asian-specific N haplogroups A, N9a, and Y were found in South Siberian populations with frequencies less than 3%. Haplogroup A mtDNAs were found to harbour seven distinct HVR1 sequences. Most of them were identified previously in Central and East Asian populations (Kolman et al. 1996; Comas et al. 1998; Yao et al. 2002). It is noteworthy that South Siberian haplogroup A mtDNAs lack polymorphisms at nps 16111 and 16192 which are typical for Northeast Asian and New World populations.

Figure 5.

Phylogenetic network of R and N macrohaplogroup HVR1 mtDNA lineages observed in South Siberian populations. Designations as shown in Figure 2.

Four different HVR1 sequences were observed within haplogroup Y. Similar to haplogroup A mtDNAs, most of them were described earlier in Central and Eastern Asia, whereas only one HVR1 sequence found in Todjins belongs to the Northeast Asian Y-subcluster defined by a transition at np 16189.

Haplogroup N9a, which is a sister haplogroup to Y according to the East Asian phylogenetic tree (Yao et al. 2002), was found in Altaians, Khakassians and Tuvinians. All South Siberian N9a haplotypes were similar or identical to those described previously in Han Chinese (Yao et al. 2002) and Central Asians (Comas et al. 1998).

The remaining East Asian-specific sequences of South Siberian gene pools belong to the two major haplogroups within macrohaplogroup R, namely B and F. Haplogroup B, comprising 5% of the total data set, is represented in South Siberia by two major clusters, B4 and B5. Subcluster B5 as a whole has the 16189-16140 motif in association with the presence of a DdeI site at np 10394, whereas B4 is determined by the 16189-16217 motif. In the populations studied, cluster B5 encompassed only two different HVR1 sequences, having an additional transition at np 16243 which is characteristic for the B5b mtDNA haplotypes (Yao et al. 2002). In contrast, cluster B4 seems to be the most frequent, covering almost all haplogroup B mtDNAs found in South Siberia. Cluster B4 includes at least two well-represented groups of HVR1 sequences: B4a, defined by a transition at np 16261, and B4b, determined by a transition at np 16136. One should note that the South Siberian B4b mtDNAs had the np 16086C mutation. This polymorphism has been observed in all subcluster B4b mtDNAs of the Altaians, Khakassians and Tuvinians presented here, as well as in Mongolians (Kolman et al. 1996), but is absent in those from East Asian populations (Horai et al. 1996; Qian et al. 2001; Yao et al. 2002). The majority of B4a mtDNA haplotypes found in Tuvinians, Todjins and Tofalars were not found in the published data sets and appear to be unique to South Siberian populations. One exception is the Tuvinian HVR1 sequence with an additional transition at np 16299, which was found previously in a Mongol sample described by Kolman et al. (1996).
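The motif-based assignment described above for haplogroup B can be sketched as a small lookup. This is a minimal illustration that uses only the diagnostic HVR1 positions named in the text; the function and table names are hypothetical, and a real assignment would also rely on the RFLP status (for example the DdeI site at np 10394), which is not modelled here.

```python
# Diagnostic HVR1 variants for haplogroup B sub-clusters, as listed in the text.
# Rules are ordered from most to least specific so that the first match wins.
B_RULES = [
    ("B5b", {"16140", "16189", "16243"}),
    ("B5", {"16140", "16189"}),
    ("B4a", {"16189", "16217", "16261"}),
    ("B4b", {"16136", "16189", "16217"}),
    ("B4", {"16189", "16217"}),
]


def classify_b(motif):
    """Return the B sub-cluster implied by an HVR1 motif, or None if no rule fits.

    The motif is a dash-separated list of variant positions relative to the
    CRS; transversions carry a base suffix (e.g. '16086C').
    """
    variants = set(motif.split("-"))
    for name, required in B_RULES:
        if required <= variants:  # every diagnostic variant is present
            return name
    return None
```

With these rules, a B4b motif carrying the additional 16086C variant, such as `"16086C-16136-16189-16217"`, is still assigned to B4b, because extra private variants do not interfere with the diagnostic set.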

According to the East Asian mtDNA phylogeny (Yao et al. 2002), haplogroup F, which was defined originally by the loss of the HincII site at np 12406, represents only one particular branch, named F1, within haplogroup R9. Haplogroup R9 includes both F and R9a and is identified by the deletion of one A at np 249, as well as a transition at np 10310 (Yao et al. 2002). Haplogroup F is therefore defined more widely and may be distinguished by a transition at np 16304. Haplogroup F, comprising 5% of the South Siberian data, includes two clusters. Only one mtDNA haplotype from the F2a subcluster was observed in Altaians. This mtDNA, characterized by the 16092A-16291 HVR1 motif and the deletion of A at np 249 in HVR2 (Figures 3 and 5), is likely identical to that described in Han Chinese (Yao et al. 2002). The remaining F-sequences are defined by the loss of the HincII site at np 12406 and belong to the F1 cluster, which may be further subdivided into two additional subclusters. The first subcluster, F1a, defined by transitions at nps 16162 and 16172, encompassed two different HVR1 sequences found in Khakassians. The second subcluster, F1b, determined by a transition at np 16189, is linked with six different HVR1 sequences observed in the majority of populations studied here. It should be noted that only one F1b mtDNA with a 16114A variant seems to be unique to South Siberia, whereas the remaining sequences were described previously in Central and Eastern Asians (Kolman et al. 1996; Horai et al. 1996; Comas et al. 1998; Yao et al. 2002). One Altaian F1 mtDNA, having a 16093-16207-16304-16362-16399 HVR1 sequence, could not be classified as F1a or F1b. Similarly, five R* haplotypes found in Tuvinians, Todjins, Tofalars and Sojots could not be further specified (Figure 5). Evidently, some of these R* mtDNAs belong to specific subgroups (one with HVR1 motif 16304-16309-16390 and another with 16051-16168-16172-16311), the phylogenetic position of which is not clear.

The West Eurasian fraction of the mtDNA pool of South Siberians is represented by haplogroups U, J, T, and H, belonging to macrohaplogroup R, and by haplogroups I, N1a, and X from macrohaplogroup N (Figure 5). Haplogroup U, which accounts for nearly 7% of the total data set, contains five clusters: U2, U3, U4, U5, and K. Although cluster K seems to be non-typical for South Siberian populations, being found in only one Sojot individual, other U-clusters are widely distributed in the populations studied. In contrast to U3 and U5, which are found in the majority of South Siberian populations, clusters U2 and U4 appear to be specific to Altaians and Khakassians. Moreover, U2 haplotypes defined by a transition at np 16214 seem to be unique to Altaians since, apart from one occurrence in Northern Caucasians, similar or identical haplotypes have not been found in Eurasian populations (according to the database of Richards et al. 2000). Cluster U5 showed the greatest diversity of HVR1 sequences relative to other clusters of haplogroup U. South Siberian cluster U5 encompassed six haplotypes, which belong to subclusters U5a (with the 16192-16256-16270 HVR1 motif) and U5b (with the 16189-16270 motif). It should be noted that the Altaian haplotype 16192-16249-16311, although similar to U1 sequences, nevertheless belongs to subcluster U5b, since it is characterized by the presence of an RsaI site at np 4732 and a transition at np 150 in HVR2 (Figures 3 and 5).

HVR1 sequences from haplogroup H, which is the largest in Europe, were detected in mtDNA pools of South Siberians with an overall frequency of 3.5%. Two Buryat and three Tofalar HVR1 sequences were found to be identical to the CRS, whereas the remaining H-haplotypes differed from it by one to four nucleotide substitutions. As expected, all Altaian H-sequences have the 73A variant in HVR2, with the exception of haplotype 16169-16184, characterized by 73G (Figure 3). Interestingly, the same H-haplotype was found recently in the Northwest Siberian Mansi population (Derbeneva et al. 2002).

Haplogroup J, which was revealed at low frequencies in the majority of South Siberian populations, is represented by two clusters, J* and J1. The central haplotype of J*, with the 16069-16126 HVR1 sequence, was shared by two Buryats and five Tofalars and also occurs in Yakuts and Evens (Derenko & Shields, 1997); it may therefore have a wide geographic distribution, whereas J1 mtDNA haplotypes are restricted to South Siberia, being found in Altaians, Tuvinians and Khakassians. Haplogroup T, which has a common origin with haplogroup J, encompassed four T* haplotypes found in Tofalars, Buryats, Khakassians and Tuvinians, and only one T1 haplotype revealed in Altaians. We have not found any identical T-sequences in the neighbouring populations of Central and East Asia, although several T* and T1 mtDNAs were observed previously in Mongolians (Kolman et al. 1996), Kazakhs, Kirghizs and Uighurs (Comas et al. 1998).

Altaians also possess mtDNAs from haplogroups X, N1a and I. Haplogroups N1a and I may have a common origin since they share transitions at nps 199 and 204 in the HVR2 (Figure 3), as well as several coding-region variants (Kivisild et al. 1999). Haplogroup N1a seems to be very rare, occurring as only 2.7% of the present data set, and being found only in Altaians. N1a-sequences have been detected so far with low frequencies in different populations of Europe and West Asia (Richards et al. 2000), but were not found in the Central and East Asian populations (Kolman et al. 1996; Comas et al. 1998; Yao et al. 2002). The same remark is true for the haplogroup I sequence, revealed in Altaian and Buryat individuals.

Haplogroup X mtDNAs found in Altaians are represented by the HVR1 motif 16189-16223-16278, which was proposed to be a root sequence of the X phylogeny (Brown et al. 1998). Haplogroup X has a remarkable geographic distribution: it occurs with low frequencies in Western Eurasian populations and amongst Native Americans, but has not been found in Asians, including Siberians, suggesting that it may have come to the Americas via a Eurasian migration (Brown et al. 1998). The only exceptions are Altaian X mtDNA variants, which occupy intermediate phylogenetic positions between European and Native American X haplotypes, as shown earlier (Derenko et al. 2001).

Thus, the data presented in this study demonstrate that South Siberian populations represent a complex pattern of the mtDNA structure, reflecting diverse interactions that occurred at different times between eastern and western Eurasian populations. Moreover, the South Siberian gene pool contains traces of the source of different expansions from the Central Asia/South Siberia region into the Americas and North Eurasia, reaching the northern European territories.

Sequence Diversity and Genetic Structure of South Siberian Populations

Table 4 lists some diversity parameters estimated for South Siberian HVR1 data. The nucleotide diversity ranged from 0.012 in Sojots to 0.017 in Altaians and Khakassians, while the haplotype diversity ranged from 0.867 in Tofalars to 0.991 in Buryats. When these values are compared with those of populations from Central and East Asia, it is apparent that the nucleotide diversity values of South Siberian populations were similar to those found in Central and Eastern Asian populations (0.014–0.019), while the haplotype diversity was higher in Central and Eastern Asia (0.980 - 0.999). The mean number of pairwise nucleotide differences was fairly uniform across different South Siberian groups, ranging from 4.9 in Sojots to 6.7 in Altaians (Table 4). These estimates are within the range of mean pairwise differences found in Central and Eastern Asian populations (5.5–7.6). All the mismatch distributions for the South Siberian groups were approximately bell-shaped, suggesting prehistoric population expansions. The raggedness statistic for the Siberian mismatch distributions varied from 0.006 to 0.036 (Table 4); values of raggedness index less than 0.05 are also indicative of prehistoric population expansions (Harpending et al. 1993). Assuming that the mismatch distributions did therefore reflect past population expansions, we estimated tau, the time of population expansion in units of mutational time (Table 4). The estimated tau-values varied from 2.9 to 6.5, which corresponds to estimated expansion times of 22000−49000 years ago, assuming a rate of human mtDNA divergence of 33% per million years (Ward et al. 1991).
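The conversion from tau to calendar time quoted above follows from tau = 2ut, where 2u is the divergence rate for the whole sequence. The following is a minimal sketch of that arithmetic. The sequence length of 400 bp is an assumption chosen because it reproduces the quoted range; the actual HVR1 length analysed is not given in this section.

```python
# Divergence rate of 33% per million years (Ward et al. 1991), per site per year.
DIVERGENCE_RATE = 0.33 / 1_000_000
# Assumed length of the analysed HVR1 segment in base pairs (not from the text).
SEQUENCE_LENGTH = 400


def expansion_time(tau, rate=DIVERGENCE_RATE, length=SEQUENCE_LENGTH):
    """Convert tau (mutational time units) into years: t = tau / (rate * length)."""
    return tau / (rate * length)


for tau in (2.9, 6.5):
    # Rounded to the nearest thousand years, as in the text.
    print(tau, round(expansion_time(tau), -3))
```

Under these assumptions the two extreme tau values, 2.9 and 6.5, map to roughly 22,000 and 49,000 years, matching the range given in the text.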

Table 4. MtDNA diversity parameters in South Siberian populations
Population | Sample size | No. of lineages | No. of polymorphic sites | Haplotype diversity | Nucleotide diversity | Mean number of pairwise differences | Mismatch observed variance | Tau | Raggedness index
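Several of the parameters in Table 4 have simple standard definitions, which can be sketched as follows. This is an illustrative implementation on aligned sequences of equal length, not the software used in the study. It uses Nei's unbiased haplotype diversity, the mean number of pairwise differences, nucleotide diversity as that mean divided by sequence length, and Harpending's raggedness index computed from the mismatch distribution.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def haplotype_diversity(seqs):
    """Nei's unbiased estimator: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences."""
    diffs = [pairwise_differences(a, b) for a, b in combinations(seqs, 2)]
    return sum(diffs) / len(diffs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def raggedness(seqs):
    """Harpending's index: sum over i = 1..d+1 of (x_i - x_{i-1})^2,
    where x_i is the relative frequency of pairs differing at i sites
    and d is the largest observed number of differences."""
    diffs = [pairwise_differences(a, b) for a, b in combinations(seqs, 2)]
    counts = Counter(diffs)
    d = max(counts)
    x = [counts.get(i, 0) / len(diffs) for i in range(d + 2)]  # x[d+1] is 0
    return sum((x[i] - x[i - 1]) ** 2 for i in range(1, d + 2))
```

A smooth, unimodal mismatch distribution yields a small raggedness value, which is why low values are read as a signature of past expansion.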

The population structure of South Siberian mtDNA sequences was investigated by the AMOVA procedure (Excoffier et al. 1992). AMOVA showed that, when the seven populations were treated as a single group, 94.95% of the total variance was within populations and 5.05% (which was statistically significant at p < 0.001) was between populations. Eliminating Tofalars, which show reduced haplotype diversity due to a possible founder effect, results in an average Fst value of 2.86% for the remaining populations. Thus, overall, a high level of between-population differentiation is observed in South Siberian populations.
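The partition of variance into within-population and between-population components can be sketched for the single-group case as follows. This is a simplified illustration of the AMOVA variance components, not the software used in the study. It assumes aligned sequences, takes the squared distance between two haplotypes as their number of differing sites, and omits the permutation test used to obtain significance levels.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared distance between haplotypes, taken as the number of differing sites."""
    return sum(x != y for x, y in zip(a, b))


def ssd(seqs):
    """Sum of squared deviations: pairwise distances summed, divided by sample size."""
    return sum(sq_dist(a, b) for a, b in combinations(seqs, 2)) / len(seqs)


def amova_one_group(populations):
    """Return (variance among populations, variance within populations, Phi_ST).

    populations is a list of lists of aligned sequences.
    """
    pooled = [s for pop in populations for s in pop]
    n_total, n_pops = len(pooled), len(populations)
    ssd_within = sum(ssd(pop) for pop in populations)
    ssd_among = ssd(pooled) - ssd_within
    msd_within = ssd_within / (n_total - n_pops)
    msd_among = ssd_among / (n_pops - 1)
    # Average sample size, corrected for unequal population sizes.
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    total = var_among + var_within
    phi_st = var_among / total if total else 0.0
    return var_among, var_within, phi_st
```

In this formulation the reported 5.05% corresponds to the among-population share of the total variance, and the remaining 94.95% to the within-population share. Slightly negative estimates can arise when populations are no more different than random subsamples.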

The pairwise Fst values were statistically significant for all South Siberian population pairs, except for Altaians and Khakassians, Tuvinians and Todjins, and Buryats and Sojots, thus providing some evidence for genetic structuring. The populations were then subdivided into Turkic-speaking (Altaians, Khakassians, Sojots, Tuvinians, Todjins and Tofalars) and Mongolic-speaking (Buryats). According to the AMOVA, the fraction of genetic variance that could be attributed to language was 1.09% (not statistically different from zero), thus indicating that language does not reflect any difference in the mtDNA pool of South Siberian populations. When populations were grouped on the basis of anthropological characteristics into Central Asian (Altaians, Khakassians, Sojots, Buryats and Tuvinians) and Baikalian (Tofalars and Todjins) groups, the proportion of the genetic variance that was due to the differences between groups was again less than that between populations within groups (3.32% and 3.69%, respectively). However, when the populations were classified according to geographic proximity, geography could account for 5.01% of the mtDNA genetic variance. In that case Altaians were grouped with Khakassians, Tuvinians with Todjins and Buryats with Sojots, whereas Tofalars, representing a small, geographically isolated tribe, were treated as a separate group.

In order to place the seven populations studied here in a broader geographical context, we expanded the analysis to include seven additional populations from Central and Eastern Asia: Mongolians (Kolman et al. 1996), Han Chinese and Koreans (Horai et al. 1996), Kazakh, Kirghiz and Uighur (Comas et al. 1998) and Han Chinese (Yao et al. 2002). The pairwise Fst values indicated that all seven South Siberian populations, as well as two Chinese populations, differ significantly from Eastern and Central Asian populations (Table 5). On the contrary, Mongolians were indistinguishable from Koreans, Kazakh and Kirghiz, while the two latter populations were closely associated both with each other and with Uighurs (Table 5). When applied to Central Asian populations (Kazakh, Kirghiz and Uighur), the AMOVA revealed that only 0.22% (p > 0.1) of the total variance could be attributed to differences between populations. East Asian populations of Mongolians, Chinese and Koreans exhibited a slightly higher Fst-value of 0.85% (p < 0.001), while South Siberian mtDNA differentiation was six times higher. Summarizing, South Siberian populations contain high levels of mtDNA diversity and high heterogeneity of mtDNA sequences among populations that could be due to geography, but not due to language and anthropological features. It should be noted that the pattern of South Siberian mtDNA differentiation is consistent with analysis of classical genetic markers, which reveals a high degree of heterogeneity among Siberian populations (Rychkov & Sheremetyeva, 1977).

Table 5.  Pairwise Fst-values between South Siberian, East Asian and Central Asian populations
14 | Chinese Han | 0.015a | 0.030a | 0.043a | 0.074a | 0.035a | 0.158a | 0.058a | 0.001d | 0.007c | 0.010b | 0.009c | 0.011b | 0.009c

Note: a, P < 0.001; b, 0.005 > P > 0.001; c, 0.05 > P > 0.005; d, P > 0.05


Four hundred and eighty individuals from seven different Altaic-speaking populations of South Siberia were studied in order to estimate the mtDNA variation, and determine the relative contribution of East Asian and West Eurasian lineages to the gene pool of the present-day South Siberians. In summary, our phylogenetic analysis shows that the majority of South Siberian mtDNA sequences can be perfectly classified into specific subhaplogroups of the Eurasian founder macrohaplogroups M, N, and R. The total sample revealed as much as 81% East Asian and 17% West Eurasian contribution to the total mtDNA pool. The amount of West Eurasian ancestry varies widely and has distinct patterns in different regions of South Siberia. The highest influx of West Eurasian mtDNA lineages was observed in populations of the Altai region (18.9% in Khakassians and 34.5% in Altaians). Also, a high West Eurasian mtDNA contribution in the East Sayan populations of Tuvinians, Todjins and Tofalars was observed (ranging from 12.5% to 22.4%), whereas in Buryats and Sojots from the Baikal region it was markedly lower (less than 10%).

Such an east-to-west cline in the frequencies of West Eurasian-specific mtDNA haplotypes observed in South Siberia is consistent with archaeological and paleoanthropological views about the presence of Europeoid-specific traits in inhabitants of the Altai and Sayan region since the Bronze Age. However, beginning from the early Iron Age, the presence of a Mongoloid component has been increasing, becoming prevalent in modern times. Thus, the dynamics of the anthropological composition of the Altai and Sayan region populations can be characterized by a directed replacement of the Europeoid component by the Mongoloid one (Alexeev, 1989). It should be noted in this respect that this process was nonuniform. Anthropological data demonstrate territorial differences in the chronology of ethnogenetic processes in at least the three largest groups of the Altai and Sayan region: Khakassians, Altaians, and Tuvinians. The most intensive phase of Khakassian formation dates back to the end of the first millennium A.D., while the admixture of Mongoloids and Europeoids in the Altai area was completed between the first and second millenniums A.D. In Tuva, the prevalence of Europeoids can be traced up to the pre-Mongolian time. Moreover, modern Tuvinians display anthropological features specific for southern Europeoids, whereas Khakassians demonstrate the influence of Eastern European anthropological traits (Alexeev & Gohman, 1984). On the other hand, Baikalian populations ancestral to the present-day Buryats have been characterized by Mongoloid-specific anthropological features since the Neolithic (Alexeev & Gohman, 1984).

Similarly, Y chromosome data revealed dual affinities of the South Siberian male lineages: they are generally characterized by a subset of southern East Asian haplotypes, while Kets, Selkups, Altaians, Shors and Khakassians also demonstrate the presence of a paleo-Europeoid component ancestral to the Native American and European Y-chromosome lineages (Santos et al. 1999; Karafet et al. 1999; Derenko et al. 2002).

The majority of South Siberian mtDNA sequences described in this study belong to East Asian-specific haplogroups and therefore may have Central and/or East Asian roots. Moreover, the results of the present study clearly demonstrate that the subset of Asian-specific mtDNA haplogroups (A, B, C, D, F, G, Y, and Z) described previously in Siberians (Torroni et al. 1993; Starikovskaya et al. 1998; Schurr et al. 1999; Derenko et al. 2000) can be extended by the inclusion of the additional haplogroups M7, M8, M9, M10, N9a and F2, which have been revealed for the first time in South Siberian populations. The considerable substructure within South Siberian haplogroups B, F, and G, together with a high degree of haplogroup C and D diversity revealed there, allows us to conclude that South Siberians carry the genetic imprint of an early-colonization phase of Eurasia. Moreover, the early presence of Europeoids in the South Siberian region is supported by the occurrence of unique U2-16214 haplotypes in Altaians, as well as by the relatively high frequency of subhaplogroup U4 revealed in Altaians and Khakassians. These U4 mtDNAs are identical to those described recently in the Northwest Siberian Mansi, where they were considered an indicator of the Upper Paleolithic population of Europeans preserved in Siberia (Derbeneva et al. 2002).

Intriguingly, despite numerous historically recorded migrations and substantial gene flow across South Siberia from the Bronze Age to the present time, the high degree of between-population differentiation has been maintained, suggesting the influence of specific demographic factors. More extensive sampling of East and Central Asian populations should provide more precise and reliable information about the relationships between their mitochondrial gene pools, and might reveal continent-wide patterns in the distribution of particular haplotypes or haplogroups; this, in turn, will contribute to our understanding of the demographic history of modern Eurasian populations.


This research was supported by grants from the Russian Foundation for Basic Research (99-06-80430), the State Committee for Scientific Research (3 P04C 048 23) and Ludwik Rydygier Medical University in Bydgoszcz, Poland (BW66/02). We thank Ewa Lewandovska for her excellent technical assistance and three anonymous reviewers for helpful comments on the manuscript.