Distribution of the F374 Allele of the SLC45A2 (MATP) Gene and Founder-Haplotype Analysis

*Address for correspondence: Isao Yuasa, Division of Legal Medicine, Faculty of Medicine, Tottori University, Yonago 683-8503, Japan. Tel. +81-859-38-6123. Fax +81-859-38-6120. E-mail: yuasai@grape.med.tottori-u.ac.jp

Summary

The membrane-associated transporter protein (MATP) plays an important role in melanin synthesis. The L374F mutation in the SLC45A2 gene encoding MATP has been suggested to be associated with skin colour in major human populations. In this study a more detailed distribution of the F374 allele was investigated in 1649 unrelated subjects from 13 Eurasian populations and one African population. The highest allele frequency was observed in Germans (0.965); French and Italians showed somewhat lower frequencies; and Turks had an intermediate value (0.615). Indians and Bangladeshis from South Asia were characterized by low frequencies (0.147 and 0.059, respectively). We also found the F374 allele in some East and Southeast Asian populations, a finding we attribute to admixture. Haplotype analysis revealed that the haplotype diversity was much lower in Germans than in Japanese, and suggested that the L374F mutation occurred only once in the ancestry of Caucasians. The large differences in distribution of the F374 allele and its haplotypes suggest that this allele may be an important factor in hypopigmentation in Caucasian populations.

Introduction

Variation in skin colour is one of the most conspicuous and polymorphic traits of humans. Skin colour is primarily due to a pigment called melanin, located in the epidermis, and is lighter in more northerly latitudes. The clinal distribution of skin colour observed among indigenous peoples is correlated with ultraviolet radiation levels. This variation in skin colour has been suggested to reflect biological adaptation to such environments and is affected by natural selection. However, this view has not been universally accepted (Jablonski & Chaplin, 2000; Relethford, 2002; Jobling et al. 2004; Diamond, 2005).

Currently 127 loci associated with mouse coat colour have been identified, and 59 of the actual genes have been cloned and sequenced. Of these the mouse underwhite (uw) locus is a major determinant of pigmentation, and mutations at the uw locus also lead to hypopigmentation (Lehman et al. 2000; Bennett & Lamoreux, 2003). The SLC45A2 gene, originally called the “antigen in melanoma (AIM-1)” gene, is expressed in most human melanoma cell lines and melanocytes, but not in normal tissue, and has been cloned in medaka (a small fresh-water teleost), mouse and human. Mutations in this gene reduce the melanin content in medaka. “AIM-1” was assigned to the uw locus and the encoded protein referred to as membrane-associated transporter protein (MATP). Thus, MATP takes part in melanin synthesis in melanosomes and is suspected to function as a membrane-transporter by directing the traffic of melanosomal proteins to the melanosome (Fukamachi et al. 2001; Harada et al. 2001; Newton et al. 2001; Kushimoto et al. 2003).

The human SLC45A2 gene (Entrez Gene ID: 51151; GenBank accession numbers: NM_016180 and NT_006576) spans 40 kb on chromosome 5p and consists of 7 exons, encoding the 530-amino acid MATP polypeptide. The human SLC45A2 gene underlies oculocutaneous albinism type 4 (OCA4), for which several mutations have been identified in Germans, Japanese, and an individual of Turkish descent (Newton et al. 2001; Inagaki et al. 2004; Rundshagen et al. 2004). Only a few polymorphisms have been identified in exons of this gene. Of these, the L374F mutation, resulting from a G-to-C transversion in exon 5, has been found to have a unique distribution: the F374 allele was observed at high frequencies of 0.89 and 0.96 in white South Africans and Germans, respectively, whereas it was not observed in Ghanaians, Japanese and New Guinea Islanders. The F374 allele is likely to be associated with skin pigmentation in the major human populations. The striking difference in distribution of the F374 allele may be a consequence of natural selection, i.e. adaptation to lower amounts of ultraviolet radiation (Nakayama et al. 2002; Yuasa et al. 2004). However, knowledge of the allele frequency and haplotype diversity in the gene encoding MATP is very limited. Elucidation of its genetic polymorphisms and haplotypes would serve as a starting point for molecular approaches to its anthropological genetics. In this study, we investigated i) the distribution of the F374 allele in several populations, ii) 13 additional SNPs in SLC45A2 and its immediately adjacent genes, AMACR (alpha-methylacyl-CoA racemase; Gene ID 23600) and RLN3R1 (relaxin 3 receptor 1, also known as SALPR; Gene ID 51289) in Africans, Japanese and Germans, and iii) founder haplotype(s) with the F374 allele in a German population.

Materials and Methods

DNA Samples

DNA samples were obtained from a total of 1649 unrelated individuals, mostly living in various regions of Eurasia (Fig. 1 & Table 1). The donors were selected at random, irrespective of skin colour, and their phenotypic data were unavailable. German samples were obtained from Germans living in Northrhine-Westphalia. French and Italian samples were from Rheims and Genoa, respectively. Turkish samples were collected from Turks living in West Germany as immigrants, and all subjects were born in various regions of Turkey. The DNA samples from Bangladeshis and Indonesians were described previously (Dobashi et al. 2003); Bangladeshis and Indians investigated in this study were Indo-European-speaking. Khalhas and Buryats are major and minor ethnic groups in Mongolia, respectively. Chinese samples came from three Han populations from Shenyang, Wuxi and Huizhou. Japanese from Okinawa were also studied. African samples were obtained from various Sub-Saharan Africans who had immigrated to Germany or Japan: 10 Africans were from Ghana and the others were from the Congo, Nigeria, Zambia, Ivory Coast, Uganda, Kenya and Zimbabwe. The samples from Munich and Tottori used for haplotype analysis are the same set as used in our previous study (Yuasa et al. 2004). The data on average pigmentation for the selected populations have been summarized in the references (Jablonski & Chaplin, 2000; Jobling et al. 2004). This study was approved by the Ethical Committee at the Faculty of Medicine, Tottori University.

Figure 1.

Location of Eurasian populations analyzed in the present study.

Table 1.  Distribution of L374F genotypes and the F374 allele in various populations
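| No | Population | Collection place | n | Genotype LL | Genotype LF | Genotype FF | Frequency of F374 | Reference |
|---|---|---|---|---|---|---|---|---|
| 1 | German | Northrhine-Westphalia | 241 | 0 | 17 | 224 | 0.965 | This study |
| 2 | German | Munich | 93 | 0 | 7 | 86 | 0.962 | Yuasa et al. (2004) |
| 3 | French | Rheims | 98 | 1 | 19 | 78 | 0.893 | This study |
| 4 | Italian | Genoa | 97 | 5 | 19 | 73 | 0.851 | This study |
| 5 | Turk | West Germany | 200 | 41 | 72 | 87 | 0.615 | This study |
| 6 | Indian | New Delhi | 51 | 37 | 13 | 1 | 0.147 | This study |
| 7 | Bangladeshi | Dhaka | 118 | 104 | 14 | 0 | 0.059 | This study |
| 8 | Khalha | Ulaan Baator | 173 | 135 | 37 | 1 | 0.113 | This study |
| 9 | Buryat | Dashbalbar | 143 | 111 | 31 | 1 | 0.115 | This study |
| 10 | Han | Shenyang | 89 | 84 | 5 | 0 | 0.028 | This study |
| 11 | Han | Wuxi | 119 | 119 | 0 | 0 | 0.000 | This study |
| 12 | Han | Huizhou | 111 | 110 | 1 | 0 | 0.005 | This study |
| 13 | Japanese | Tottori | 103 | 103 | 0 | 0 | 0.000 | Yuasa et al. (2004) |
| 14 | Japanese | Okinawa | 87 | 87 | 0 | 0 | 0.000 | This study |
| 15 | Indonesian | Surabaya | 105 | 104 | 1 | 0 | 0.005 | This study |
| 16 | African | Germany/Japan | 17 | 17 | 0 | 0 | 0.000 | This study |
|  | White South African |  | 54 |  |  |  | 0.89 | Nakayama et al. (2002) |
|  | Ghanaian |  | 50 |  |  |  | 0.00 | Nakayama et al. (2002) |
|  | New Guinea Islander |  | 52 |  |  |  | 0.00 | Nakayama et al. (2002) |
|  | Japanese |  | 49 |  |  |  | 0.00 | Nakayama et al. (2002) |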

Population Study of the F374 Allele

DNA samples were typed for the L374F polymorphism by the amplified product length polymorphism (APLP) method as described previously (Yuasa et al. 2004).

Genotyping of SNPs

Many SNPs at the loci AMACR, SLC45A2 and RLN3R1 have been reported in the GenBank SNP database. In this study several SNPs which could be genotyped with restriction enzymes were arbitrarily selected (Fig. 2). DNA from 17 Africans, 103 Japanese from Tottori and 92 Germans from Munich was subjected to polymerase chain reaction (PCR) followed by restriction fragment length polymorphism (RFLP) analysis. For the African samples, the whole genome was amplified prior to PCR using REPLI-g (Qiagen, Hilden, Germany) because of the small volumes of DNA available. Primers (Table 2) for the specific amplification of fragments encompassing mutation sites were designed on the basis of the genomic sequence (NT_006576). PCR was performed in a volume of 12 μl containing 20 ng genomic DNA, 2.5 pmol of each primer, 200 μM of each dNTP and 0.3 U HotStarTaq polymerase (Qiagen). Cycle conditions were 95°C for 15 min, then 30-35 cycles of 94°C for 30 sec, 55-56°C for 30 sec and 72°C for 40 sec, and a final extension step of 10 min at 72°C, in a GeneAmp PCR System 9700 (Applied Biosystems, Foster City, CA). Products were digested with restriction enzymes (Table 2). The digested fragments were separated by 6% or 7.5% polyacrylamide gel electrophoresis and then visualized by ethidium bromide staining.

Figure 2.

Schematic diagram of the AMACR, SLC45A2 and RLN3R1 genes, showing locations of exons and 17 SNPs. Arrows below each gene name show the direction of transcription.

Table 2.  Oligonucleotide sequence for SNP genotyping in human AMACR, SLC45A2 and RLN3R1
| SNP | Marker position on NT_006576 | Primer | Sequence (5′→3′) | Restriction enzyme |
|---|---|---|---|---|
| 1 | 16459765 | AMACR-Y1F1 | CTTGGGTTCTGAGATACTGCTGTT | MspI |
|  |  | AMACR-Y1R1 | CAGGAAGGGAATCCTATGGCTTT |  |
| 2 | 16458616 | AMACR-Y1F2 | TGGTTACCTGGGGTTTGTGCT | RsaI |
|  |  | AMACR-Y1R2 | GCCTAGCATCAGTACCTTCACGA |  |
| 3 | 16450989 | AMACR-E4F2 | ATGCTTTGACACTTGAGTTATCTGG | Tsp509I |
|  |  | AMACR-E4R2 | CATCTGCTGTCCTGTAAGTCGT |  |
| 4 | 16447049 | AMACR-Y4F1 | CATATCCACACTTCACCTCA | TaqI |
|  |  | AMACR-Y4R1 | AACTCAGCAGAAGAATACCC |  |
| 5 | 16435295 | MATP-Y1F2 | GTTCAAAGATGTGGCCTCTGAC | MspI |
|  |  | MATP-Y1R2 | GTAGTCAAACGCAGTGCTTCTC |  |
| 6 | 16434674 | MATP-Y1F | TTGTGTTTTCTGGACATTCGCT | Bsh1236I |
|  |  | MATP-Y1R | CTTCCTCCTTGGGTTAGCAATC |  |
| 7 | 16430294 | MATP-Y2F6 | TATGTCCAACACCTCCCTTCTC | Eco130I |
|  |  | MATP-Y2R6 | TCTAAGGCCTCCAACTGACTG |  |
| 8 | 16426848 | MATP-Y2F2 | ACTGTGGTATCCTCACTCTG | Hsp92II |
|  |  | MATP-Y2R2 | TGCTTCAAGTGGAATGCTG |  |
| 9 | 16422112 | MATP-Y2F3 | AAACGCTTCACGGTGTGCTGA | VspI |
|  |  | MATP-Y2R3 | ATTTCCTGGTCAGACCCCATGGA |  |
| 12 | 16404484 | MATP-Y4F | AGCAATCATGGCAGGCTTCA | Hsp92II |
|  |  | MATP-Y4R | AGTCATAGGTCACCGCATCCT |  |
| 14 | 16402809 | MATP-Y5F | ACCATTCTCAGGAGGACTCA | Mph1103I |
|  |  | MATP-Y5R | GTGCTATGGCTAGGTAAAGGG |  |
| 15 | 16400425 | MATP-Y5F2 | AGCCAAGTTGACCTGCTAGA | MspI |
|  |  | MATP-Y5R2 | TGAAATCACCCAGCTTAATGCC |  |
| 17 | 16388182 | SALPR-F | ACGTCAAAGCCGACTTTCTCC | PvuII |
|  |  | SALPR-R | GAGAGTTGAGCTGCAGGCAA |  |

The primers for SNPs 10, 11, 13 and 16 were described previously (Yuasa et al. 2004).

Statistical Analysis

Allele frequencies were estimated by direct counting, and conformity to Hardy-Weinberg equilibrium was tested using the χ² test with 1 df. Wright's FST statistic was used to estimate the proportion of variation attributable to differences in the L374F mutation among 17 populations, and in each SNP among the African, Japanese and German populations (Hartl & Clark, 1997). Linkage disequilibrium (LD) between pairs of polymorphic sites was measured using the statistics D′ (Lewontin, 1964) and r² (Hill & Robertson, 1968). Haplotype diversity and frequency for multiple loci were estimated by the expectation-maximization (EM) method using the Arlequin program (Schneider et al. 2000).
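The direct-counting estimate and the Hardy-Weinberg χ² test can be illustrated with a short Python sketch. The function name `hwe_chi_square` is hypothetical and the code is not the authors' own procedure; it assumes the standard three-genotype goodness-of-fit test with 1 df and uses the genotype counts given in Table 1.

```python
def hwe_chi_square(n_ll, n_lf, n_ff):
    """Return (F374 frequency, chi-square statistic) for one population.

    Hypothetical helper: allele frequency by direct counting, followed by
    a goodness-of-fit test against Hardy-Weinberg proportions (1 df).
    """
    n = n_ll + n_lf + n_ff
    q = (2 * n_ff + n_lf) / (2 * n)      # F374 allele frequency
    p = 1.0 - q                          # L374 allele frequency
    observed = (n_ll, n_lf, n_ff)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return q, chi2


CRITICAL_1DF_005 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Turkish sample from Table 1: LL = 41, LF = 72, FF = 87
q_turk, chi2_turk = hwe_chi_square(41, 72, 87)
print(round(q_turk, 3), chi2_turk > CRITICAL_1DF_005)  # 0.615 True
```

With very small expected counts (for example the LL class in Germans) the χ² approximation is unreliable, so an exact test may be preferable in such cases.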

Results

Distribution of F374 Allele in SLC45A2

Table 1 summarizes the distribution of L374F genotypes and frequencies of the F374 allele in a total of 2050 unrelated individuals from 20 populations, including previously reported data (Nakayama et al. 2002; Yuasa et al. 2004). A significant deviation from Hardy-Weinberg equilibrium was observed in the Italian and Turkish populations due to an excess of homozygotes and a deficiency of heterozygotes. Although the reasons for this are not known, this deviation may be affected by assortative mating and/or Wahlund's principle (Hartl & Clark, 1997). The results showed a marked difference in the distribution of the F374 allele among various populations. Germans had the highest estimated frequency. French and Italians showed a significantly lower frequency than Germans. Turks were characterized by an intermediate value. Indians and Bangladeshis showed a much lower frequency than Europeans and Turks. Khalhas and Buryats from Mongolia revealed similar values to each other, and were somewhat higher than Bangladeshis. In contrast, the F374 allele was much less common in Han Chinese and Indonesian populations, and completely absent in Africans and Japanese (Nakayama et al. 2002). FST was estimated to be 0.74 among the 20 populations (see Table 1).
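The FST value quoted above can be checked with a minimal Python sketch. It assumes the simplest form of Wright's statistic for a biallelic locus, the unweighted variance of allele frequencies among populations divided by p̄(1 − p̄); the text does not state which estimator was actually used, and the function name `wright_fst` is hypothetical. Under this assumption the 20 frequencies in Table 1 give a value close to 0.74.

```python
def wright_fst(freqs):
    """Unweighted Wright's FST for one biallelic locus.

    Assumption: FST = Var(p) / (p_bar * (1 - p_bar)), where Var(p) is the
    population variance of allele frequencies (divisor = number of
    populations) and sample sizes are ignored.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    variance = sum((p - p_bar) ** 2 for p in freqs) / k
    return variance / (p_bar * (1.0 - p_bar))


# F374 frequencies of the 20 populations listed in Table 1
table1 = [0.965, 0.962, 0.893, 0.851, 0.615, 0.147, 0.059, 0.113, 0.115,
          0.028, 0.000, 0.005, 0.000, 0.000, 0.005, 0.000,
          0.89, 0.00, 0.00, 0.00]
print(round(wright_fst(table1), 2))  # 0.74
```

Weighting by sample size or applying a sampling correction would give slightly different values.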

SNP Genotyping

A total of 17 SNPs, which are scattered throughout an 80 kb genomic region containing the complete AMACR, SLC45A2 and RLN3R1 genes (Fig. 2), were investigated in three populations. Table 3 shows the frequencies of mutant alleles, which were defined in this study as minor alleles found in the African and then Japanese populations. Striking differences were observed in allele frequency among the three populations. The German population was characterized by the lowest average heterozygosity for 12 SNPs in the SLC45A2 gene. Interestingly, the heterozygosity of 4 out of 5 SNPs in the two adjacent genes, AMACR and RLN3R1, was rather higher in Germans than in Japanese. Of all the 17 SNPs the L374F mutation revealed the highest FST value (0.94). The SNPs near the L374F mutation site also revealed relatively high values in comparison with the SNPs in the 5′-flanking region of the SLC45A2 gene. Low FST values were generally observed in the two adjacent genes.

Table 3.  Frequency, heterozygosity and FST of SNPs in Africans, Japanese (Tottori) and Germans (Munich)
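| SNP No | SNP position on NT_006576 | GenBank SNP number | Region | Mutation | Frequency of mutant*: African (n = 17) | Frequency of mutant*: Japanese (n = 103) | Frequency of mutant*: German (n = 92) | Heterozygosity: African (n = 17) | Heterozygosity: Japanese (n = 103) | Heterozygosity: German (n = 92) | FST |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AMACR |  |  |  |  |  |  |  |  |  |  |  |
| 1 | 16459765 | rs34688 | intron 1 | c > t | 0.5000 | 0.0874 | 0.1196 | 0.515 | 0.160 | 0.212 | 0.195 |
| 2 | 16458616 | rs34687 | intron 1 | g > a | 0.4706 | 0.0874 | 0.1196 | 0.513 | 0.160 | 0.212 | 0.172 |
| 3 | 16450989 | rs2287939 | exon 4 | C > T (S201L) | 0.1176 | 0.1408 | 0.2989 | 0.214 | 0.243 | 0.421 | 0.043 |
| 4 | 16447049 | rs840409 | intron 4 | c > g | 0.0000 | 0.0874 | 0.0761 | 0.000 | 0.160 | 0.141 | 0.029 |
| Average heterozygosity |  |  |  |  |  |  |  | 0.311 | 0.181 | 0.247 |  |
| SLC45A2 |  |  |  |  |  |  |  |  |  |  |  |
| 5 | 16435295 | rs732740 | intron 1 | t > c | 0.0000 | 0.0000 | 0.0054 | 0.000 | 0.000 | 0.011 | 0.004 |
| 6 | 16434674 | rs250413 | intron 1 | c > t | 0.0000 | 0.1650 | 0.0109 | 0.000 | 0.277 | 0.022 | 0.103 |
| 7 | 16430294 | rs181832 | intron 2 | t > c | 0.3235 | 0.1650 | 0.0217 | 0.451 | 0.277 | 0.043 | 0.108 |
| 8 | 16426848 | rs3776549 | intron 2 | g > a | 0.2647 | 0.2087 | 0.0489 | 0.401 | 0.332 | 0.094 | 0.058 |
| 9 | 16422112 | rs3756462 | intron 2 | t > c | 0.0000 | 0.2087 | 0.0489 | 0.000 | 0.332 | 0.094 | 0.101 |
| 10 | 16415976 | rs26722 | exon 3 | G > A (E272K) | 0.0588 | 0.3835 | 0.0326 | 0.110 | 0.475 | 0.063 | 0.191 |
| 11 | 16406617 | rs2287949 | exon 4 | G > A (T329T) | 0.0882 | 0.2379 | 0.0054 | 0.166 | 0.364 | 0.011 | 0.094 |
| 12 | 16404484 | rs250417 | intron 4 | g > c | 0.2059 | 0.6408 | 0.0326 | 0.337 | 0.463 | 0.063 | 0.316 |
| 13 | 16403799 | rs16891982 | exon 5 | G > C (L374F) | 0.0000 | 0.0000 | 0.9620 | 0.000 | 0.000 | 0.074 | 0.944 |
| 14 | 16402809 | rs40132 | intron 5 | t > c | 0.0294 | 0.4078 | 0.0272 | 0.059 | 0.485 | 0.053 | 0.245 |
| 15 | 16400425 | rs35394 | intron 5 | a > g | 0.0294 | 0.4029 | 0.0272 | 0.059 | 0.483 | 0.053 | 0.240 |
| 16 | 16396933 | rs3733808 | exon 7 | G > C (V507L) | 0.0000 | 0.0049 | 0.0000 | 0.000 | 0.010 | 0.000 | 0.003 |
| Average heterozygosity |  |  |  |  |  |  |  | 0.132 | 0.292 | 0.048 |  |
| RLN3R1 |  |  |  |  |  |  |  |  |  |  |  |
| 17 | 16388182 | rs42868 | 5′-flanking | c > g | 0.3529 | 0.0437 | 0.1848 | 0.471 | 0.084 | 0.303 | 0.102 |
| Average heterozygosity in three loci |  |  |  |  |  |  |  | 0.194 | 0.253 | 0.110 |  |

*Mutant alleles are defined as minor alleles found in Africans in this study.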
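The heterozygosity column of Table 3 can be approximated from the allele frequencies with the sketch below. It assumes the unbiased expected heterozygosity, 2pq scaled by 2n/(2n − 1) for n diploid individuals; this assumption is ours and is not stated in the text, although it appears consistent with most of the tabulated values. The function name `expected_heterozygosity` is hypothetical.

```python
def expected_heterozygosity(freq, n_individuals):
    """Unbiased expected heterozygosity for a biallelic SNP.

    Assumption: H = 2n / (2n - 1) * 2 * p * (1 - p), with n diploid
    individuals (2n sampled chromosomes).
    """
    chromosomes = 2 * n_individuals
    correction = chromosomes / (chromosomes - 1)
    return correction * 2.0 * freq * (1.0 - freq)


# SNP 1 in Africans (frequency 0.5000, n = 17)
print(round(expected_heterozygosity(0.5000, 17), 3))  # 0.515
# SNP 13 (L374F) in Germans (frequency 0.9620, n = 92)
print(round(expected_heterozygosity(0.9620, 92), 3))  # 0.074
```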

Pairwise LD in MATP

Pairwise LD values in the Japanese and German populations were evaluated between major SNPs in SLC45A2 by D′ and r2. Fig. 3 illustrates substantial differences between D′ and r2. The D′ values were generally much higher than the r2 values, as D′ values equal to 1.0 are caused by the presence of only three out of four possible haplotypes for a pair of loci. Although rare alleles with frequencies <5% do not have sufficient statistical power for LD detection (Lewontin, 1995; Goddard et al. 2000), the pairwise values between SNPs 10–15 were relatively high in both populations, suggesting that this 15-kb region forms a haplotype block.

Figure 3.

Pairwise LD analysis of the SLC45A2 gene region in Japanese (A) and Germans (B), evaluated by r² and D′ (above and below the diagonal, respectively). Numbers indicate the SNPs.
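The contrast between D′ and r² noted above can be illustrated with a minimal sketch for two biallelic loci. The function name `pairwise_ld` and the haplotype frequencies in the example are hypothetical and are not taken from the study data. The example has only three of the four possible haplotypes, which gives |D′| = 1 while r² stays well below 1.

```python
def pairwise_ld(h11, h10, h01, h00):
    """Return (|D'|, r^2) from the four two-locus haplotype frequencies.

    h11 is the frequency of the haplotype carrying allele 1 at both loci,
    h10 carries allele 1 at the first locus only, and so on.
    """
    p = h11 + h10                      # allele 1 frequency, first locus
    q = h11 + h01                      # allele 1 frequency, second locus
    d = h11 - p * q                    # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    denom = p * (1 - p) * q * (1 - q)
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    r2 = d * d / denom if denom > 0 else float("nan")
    return d_prime, r2


# Hypothetical frequencies with one haplotype class missing (h01 = 0)
d_prime, r2 = pairwise_ld(0.50, 0.25, 0.00, 0.25)
print(d_prime, round(r2, 3))  # 1.0 0.333
```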

Haplotype Analysis

Haplotypes were constructed on the basis of the genotype data from 12 SNPs in SLC45A2. Table 4 lists the 32 haplotypes and their frequencies, as estimated by the EM algorithm from phase-unknown samples. This procedure has been shown to estimate common haplotype frequencies accurately when the Hardy-Weinberg assumption is fulfilled (Fallin & Schork, 2000; Tishkoff et al. 2000). Although SNPs 8 and 9 investigated in the Japanese samples showed a slight deviation from Hardy-Weinberg equilibrium (χ² = 4.33; 0.05 > p > 0.02), data from these SNPs were included in the estimation of haplotypes. A total of 32 haplotypes were observed and classified into 4 groups based on SNPs 10-15. In the Japanese population 17 haplotypes were observed, belonging to 3 of the groups, and the major haplotype in each of these groups (hp1, hp15 and hp25) shared the nucleotide sequence TCTGT for SNPs 5-9. Group 2, consisting of the F374-containing haplotypes, accounted for most of the haplotypes in Germans. The L374-containing haplotypes found in Germans were a subset of those in the Japanese. These findings indicate less allelic complexity in Germans than in Japanese. Haplotype diversity was estimated as 0.815 ± 0.051 for Africans, 0.881 ± 0.017 for Japanese and 0.261 ± 0.043 for Germans, and the mean number of pairwise differences was 1.586 ± 0.090 for Africans, 3.50 ± 1.79 for Japanese and 0.58 ± 0.47 for Germans. One major haplotype (hp9) differed by one nucleotide in SNP 13 from hp1; hp9 must have arisen following a G-to-C transversion in hp1. Minor haplotypes in each group arose through recombination events, as revealed by the presence of the same sequences in SNPs 5-9. In the German population recombination between F374-containing and L374-containing haplotypes must have occurred prior to the disappearance of the L374-carrying haplotypes.

Table 4.  Haplotypes and their frequencies in African, Japanese and German
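| Group | Haplotype | Sequence of SNPs 5-16 | African | Japanese | German |
|---|---|---|---|---|---|
| 1 | hp1 | TCTGTGGGGTAG | 0.449 | 0.245 |  |
|  | hp2 | TCCGTGGGGTAG | 0.116 |  |  |
|  | hp3 | TCCATGGGGTAG | 0.109 |  |  |
|  | hp4 | TCTATGGGGTAG | 0.085 |  |  |
|  | hp5 | TCCGTAGGGTAG | 0.034 |  |  |
|  | hp6 | TTCGTGGGGTAG |  | 0.056 |  |
|  | hp7 | TCTACGGGGTAG |  | 0.053 | 0.005 |
|  | hp8 | TCTGTGAGGTAG |  | 0.005 |  |
| 2 | hp9 | TCTGTGGGCTAG |  |  | 0.896 |
|  | hp10 | TCTACGGGCTAG |  |  | 0.039 |
|  | hp11 | TTCGTGGGCTAG |  |  | 0.011 |
|  | hp12 | TCCGTGGGCTAG |  |  | 0.005 |
|  | hp13 | CCCGTGGGCTAG |  |  | 0.005 |
|  | hp14 | TCTGTAGGCTAG |  |  | 0.005 |
| 3 | hp15 | TCTGTGACGTAG | 0.048 | 0.102 | 0.005 |
|  | hp16 | TCTATGACGTAG | 0.031 |  |  |
|  | hp17 | TCCATGACGTAG | 0.010 |  |  |
|  | hp18 | TCTACGACGTAG |  | 0.059 |  |
|  | hp19 | TTCGTGACGTAG |  | 0.055 |  |
|  | hp20 | TCTGTGACGTAC |  | 0.005 |  |
|  | hp21 | TTCACGACGTAG |  | 0.003 |  |
|  | hp22 | TTCGTAACGTAG |  | 0.010 |  |
|  | hp23 | TCTGTGGCGTAG | 0.064 |  |  |
|  | hp24 | TCCGTAGCGTAG | 0.025 |  |  |
| 4 | hp25 | TCTGTAGCGCGG |  | 0.256 | 0.023 |
|  | hp26 | TCTACAGCGCGG |  | 0.078 | 0.004 |
|  | hp27 | TTCGTAGCGCGG |  | 0.039 |  |
|  | hp28 | TCCATGGCGCGG | 0.029 |  |  |
|  | hp29 | TCTGTGGCGCGG |  | 0.014 |  |
|  | hp30 | TCTACGGCGCGG |  | 0.013 |  |
|  | hp31 | TCTGTGGCGCAG |  | 0.005 |  |
|  | hp32 | TTCACGGCGCGG |  | 0.003 |  |
| Total |  |  | 1.000 | 1.000 | 1.000 |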
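The principle behind EM estimation of haplotype frequencies from phase-unknown genotypes can be shown with a small Python sketch. This is a generic textbook-style EM under the Hardy-Weinberg assumption, not the Arlequin implementation, which may differ in details such as starting values and convergence criteria. The function names and the toy two-locus data are hypothetical. Genotypes are coded as the number of copies (0, 1 or 2) of an arbitrarily chosen allele at each SNP.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with a multi-locus genotype."""
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]   # homozygous sites; het sites set below
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = base[:], base[:]
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in pair_lists for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}   # uniform starting values
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: probability of each phase resolution given current freqs
            weights = [freq[a] * freq[b] * (1 if a == b else 2)
                       for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts become the new frequencies
        new_freq = {h: counts[h] / n_chrom for h in haps}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


# Toy data: 4 + 4 unambiguous homozygotes and 2 phase-ambiguous double
# heterozygotes at two SNPs
toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimates = em_haplotype_frequencies(toy)
```

In the toy data the double heterozygotes are resolved almost entirely as the two haplotypes already seen in homozygotes, so the estimates converge to about 0.5 for each of them. The number of phase resolutions doubles with every heterozygous site, which remains manageable for the 12 SNPs analysed here.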

Discussion

In this study the distribution of the F374 allele of the human SLC45A2 gene, encoding MATP, was confirmed to differ greatly among three major human populations. Among the many Caucasian-specific alleles observed to date, none has been found to show a distribution like that of the F374 allele. The F374 allele shows near-fixation in Germans. White South Africans showed a frequency of 0.89 due to admixture of several European ethnic groups, and contained 7% non-European genes (Nakayama et al. 2002). It is possible to estimate the allele frequency in their ancestors (p) using the model of one-way migration (Hartl & Clark, 1997):

$$p' = (1 - m)\,p + m\,p^{*} \qquad (1)$$

where p′ is the frequency in white South Africans (0.89), p* is the frequency in non-Europeans (0), and m is the present proportion of copies of SLC45A2 that were derived from non-Europeans (0.07). The allele frequency in ancestors of white South Africans (p) was calculated to be 0.957; this value is comparable to that found in Germans, suggesting that northern Europeans, or Germanic-speaking people, share high frequencies of the F374 allele. In comparison with Europeans, Turks, Indians and Bangladeshis showed significantly lower frequencies of the F374 allele, which gradually declined from Germany to Bangladesh. It seems that these frequencies are associated with latitude, ultraviolet radiation levels and skin colour. The F374 allele was also found in East and Southeast Asia, excluding Japanese populations and a Han-Chinese population from Wuxi. It is likely that there was gene flow from Europe and the Middle East to East Asia along the Silk Road and/or its northern and southern regions. Western Eurasian-specific haplogroups of mitochondrial DNA were observed at a frequency of 14.3% in Mongolians from Xinjiang Province, China (Yao et al. 2004). Khalhas from Mongolia and Buryats from Siberia have a fairly high frequency of mitochondrial DNA haplotypes of European origin (Pakendorf et al. 2003). Thus, extensive gene admixture has occurred between Asians and Europeans, and we suggest that the F374 allele is substantially specific to so-called Caucasoid populations, including Europeans, west and south Asians and north Africans.
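The ancestral frequency quoted above follows from solving the one-way migration model, p′ = (1 − m)p + mp*, for p. A minimal Python sketch of this arithmetic, with a hypothetical function name, is:

```python
def ancestral_frequency(p_now, p_source, m):
    """Solve p' = (1 - m) * p + m * p* for the ancestral frequency p.

    p_now    : present frequency in the admixed population (p')
    p_source : frequency in the migrant source population (p*)
    m        : proportion of gene copies derived from the source population
    """
    return (p_now - m * p_source) / (1.0 - m)


# Values given in the text for white South Africans
p_ancestral = ancestral_frequency(p_now=0.89, p_source=0.0, m=0.07)
print(round(p_ancestral, 3))  # 0.957
```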

The large differences in distribution of the F374 allele are comparable to those of the FY*O allele (the null allele at the Duffy blood group locus), with an FST value of 0.78 found in Sub-Saharan Africans (Cavalli-Sforza et al. 1994). Exceptionally high FST values are a potential indicator of the effects of directional selection (Lewontin & Krakauer, 1973; Bowcock et al. 1991). Because MATP plays an important role in melanin synthesis, the high frequency of the F374 allele among Europeans may be a response to selection under lower levels of solar ultraviolet radiation, and may be associated with depigmentation. The highest frequencies of the F374 allele in northern parts of Europe do not necessarily indicate its origin and spread. European ancestors, after expanding toward the higher latitudes, may have acquired higher frequencies of this allele to adapt to lower levels of solar ultraviolet radiation. Most of the L374-carrying haplotypes were lost during expansion. Haplotype analysis has also revealed a large difference in haplotype diversity between the Japanese and Germans. The F374-bearing haplotypes found in Germans shared the same basic sequence in SNPs 10-15 in a 15-kb region, which formed a haplotype block characterized by low haplotype diversity, strong associations between alleles, and rare recombination. These findings suggest that the L374F mutation occurred only once in the ancestry of the Caucasian group. Haplotype hp9 is probably a founder haplotype, constituting 93% of the F374-carrying haplotypes.
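The 93% share attributed to hp9 can be checked directly from the German column of Table 4. The snippet below is only an arithmetic illustration using the group 2 (F374-carrying) haplotype frequencies; the variable names are ours.

```python
# German frequencies of the F374-carrying (group 2) haplotypes in Table 4
f374_haplotypes = {
    "hp9": 0.896, "hp10": 0.039, "hp11": 0.011,
    "hp12": 0.005, "hp13": 0.005, "hp14": 0.005,
}

total_f374 = sum(f374_haplotypes.values())
hp9_share = f374_haplotypes["hp9"] / total_f374
print(f"{total_f374:.3f} {hp9_share:.2f}")  # 0.961 0.93
```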

Relatively little is understood about the genes responsible for differences in skin, hair and eye colour among individuals within or between populations. Of several human pigmentation-related genes, the MC1R (melanocortin 1 receptor) gene also plays a very important role. Some mutations in MC1R are associated with red hair, fair skin and freckles. Low diversity in MC1R was observed in Africans, whereas the diversity increases in Europeans and, to a lesser degree, in Asians (Rees, 2004; Sturm & Frudakis, 2004). In contrast, genetic diversity in SLC45A2 was very low in Europeans in comparison to Africans and Japanese. Because the SLC45A2 gene is known to be responsible for OCA4, some mutations would be expected to generate hypopigmentation. Although no direct experimental data have been obtained to date on the effect of the F374 allele on the activity of MATP in pigmentation, the large difference in distribution of the F374 allele and its haplotypes suggests that this allele may be an important factor in hypopigmentation in Caucasian populations.

Acknowledgements

We would like to thank the late Prof. Satoshi Horai for providing us with some African DNA samples. This study was supported in part by Grants-in-Aid for Scientific Research (to I.Y. and K.U.) from the Japan Society for the Promotion of Science.
