Keywords:

  • human leukocyte antigen;
  • population study

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Acknowledgment
  8. References

Sequence-based typing was used to identify human leukocyte antigen (HLA)-A, -B, -C, and -DRB1 alleles from 564 consecutively recruited African American volunteers for an unrelated hematopoietic stem cell registry. The number of known alleles identified at each locus was 42 for HLA-A, 67 for HLA-B, 33 for HLA-C, and 44 for HLA-DRB1. Six novel alleles (A*260104, A*7411, Cw*0813, Cw*1608, Cw*1704, and DRB1*130502) not observed in the initial sequence-specific oligonucleotide probe testing were characterized. The action of balancing selection, shaping more ‘even’ than expected allele frequency distributions, was inferred for all four loci and significantly so for the HLA-A and DRB1 loci. Two-, three-, and four-locus haplotypes were estimated using the expectation maximization algorithm. Comparisons with other populations from Africa and Europe suggest that the degree of European admixture in the African American population described here is lower than that in other African American populations previously reported, although HLA-A:B haplotype frequencies similar to those in previous studies of African American individuals were also noted.


Introduction

Many African American individuals trace their ancestry to African slaves imported into the Americas between the 1600s and the 1800s. These slaves originated from the west coast of Africa, in a region extending from Senegal south to Angola (1). In Census 2000, 12.9% of the US population, 36.4 million individuals, identified themselves as African American (2). Studies using genetic markers have estimated 15–20% European ancestry in this population, although levels of European admixture vary among regions of the Americas and, in particular, within the United States (e.g. 7% in Afro-Caribbeans from Jamaica vs 12.6% in Charleston, South Carolina, vs 22.5% in New Orleans, Louisiana) (3–5). The degree of Native American admixture in African American populations is controversial but may be as low as 1%–3%. From a human leukocyte antigen (HLA) standpoint, African American populations carry a more diverse set of alleles and haplotypes than other US populations (6–8).

HLA molecules bind both self- and foreign peptides found within the endoplasmic reticulum or the endocytic pathway of cells and transport them to the cell surface for potential recognition by T-lymphocyte antigen receptors. The characteristics of the binding grooves of individual HLA molecules control the repertoire of bound peptides and affect the immune response profiles of individuals (9). The multiple HLA genes carried within the human major histocompatibility complex are highly polymorphic, with over 700 alleles identified at the HLA-B locus alone (10). The HLA alleles are observed at different frequencies in various human populations and have been used to measure population diversity and make inferences about population history (11–14). It is believed that this variability is the result of selective pressure for immune response diversity in human populations (15). This study uses DNA sequencing to unambiguously identify HLA alleles carried by an African American population and algorithms developed in the International Histocompatibility Workshops to analyze diversity and predict haplotypes.

Materials and methods

Sample population

The study population included 564 African American individuals consecutively recruited as volunteer donors for a bone marrow registry from January 2004 through March 2004. Because of the recruitment setting, individuals are unlikely to be related and are likely to originate from different areas of the United States. All are self-identified as African Americans.

Identification of known HLA alleles

Genomic DNA was prepared using the QIAamp 96 DNA blood kit (Qiagen, Valencia, CA). Each individual was initially typed at intermediate resolution for HLA-A, -B, -C, and -DRB1 by sequence-specific probe-based hybridization using the One Lambda LABType® SSO Kit (One Lambda, Canoga Park, CA) following the manufacturer’s protocols. To identify the HLA-A, -B, and -C alleles carried by each individual, polymerase chain reaction (PCR) primers (Table 1) were used to amplify each locus as previously described (16–20). Applied Biosystems Big Dye terminator chemistry and sequencing primers listed in Table 1 were used to obtain the sequences of both strands of exons 2 and 3. DRB1 alleles were amplified and sequenced using the HLA-DRB High Resolution Typing System (Applied Biosystems, Foster City, CA). This kit includes primers to amplify specific DRB1 allele families; additional in-house PCR and sequencing primers were added when needed to obtain resolution. Reaction products were identified with Applied Biosystems Models 3700 or 3730xl DNA analyzers (PE Applied Biosystems, Foster City, CA) and sequence interpretation was done with Assign software (Conexio Genomics, Applecross, Western Australia). Alleles identical in exons 2 and 3 (class I) or exon 2 (DRB1) (10) were not resolved except for B*180101 vs B*1817N alleles (16). For those class I samples yielding alternative allele combinations (10), either allele-specific sequencing primers or allele-specific PCR amplification was used to link polymorphisms and to identify the specific allele combination (10). (In-house primer sequences used for all loci are available at www.dodmarrow.org.)

Table 1.  DNA amplification and sequencing reagents used for HLA-A, -B and -C identification

Locus-specific PCR primers (a)
Locus | Forward PCR primer (5′–3′) | Reverse PCR primer (5′–3′)
HLA-A | 5A2: CCC AGA CGC CGA GGA TGG CCG | 3A2: GCA GGG CGG AAC CTC AGA GTC ACT CTC T
HLA-B | 5B3: GGG TCC CAG TTC TAA AGT CCC CAC G | 3B1: CCA TCC CCG GCG ACC TAT AGG AGA TG
HLA-B | 5B1: GCA CCC ACC CGG ACT CAG AAT CTC CT | 3B1-AC: AGG CCA TCC CGG GCG ATC TAT
HLA-C | 5Cin1-61: AGC GAG GKG CCC GCC CGG CGA (b) | 3BCin3-12: GGA GAT GGG GAA GGC TCC CCA CT

Sequencing primers (c)
Locus | Exon 2 (5′–3′) | Exon 3 (5′–3′)
HLA-A | 5AIn1-46: GAA ACS GCC TCT GYG GGG AGA AGC AA | 5In2-148: GTT TCA TTT TCA GTT TAG GCC A
HLA-A | 3In2-65: TCG GAC CCG GAG ACT GTG | 3AIn3-66: TGT TGG TCC CAA TTG TCT CCC CTC
HLA-A | AINT1F: GCG CCK GGA GGA GGG T | INT2F: TTA CCC GGT TTC ATT TTC AG
HLA-A | INT2R: GGA TCT CGG ACC CGG AG | AINT3R: TCC TTG TGG GAG GCC AG
HLA-B | 5Bin1-57: GGG AGG AGC GAG GGG ACC SCA G | Bex3F: GGK CCA GGG TCT CAC A
HLA-B | Bex2R: CAC TCA CCG GCC TCG CTC TGG | Bin3-37: GGA GGC CAT CCC CGG CGA CCT AT
HLA-C | Cex2F: GGG TCG GGC GGG TCT CAG CC | Cex3F: TGA CCR CGG GGG CGG GGC C
HLA-C | Cex2R: GGA GGG GTC GTG ACC TGC GC | 3BCIn3-12: GGA GAT GGG GAA GGC TCC CCA CT

  • HLA, human leukocyte antigen; PCR, polymerase chain reaction.

  • a Most of the locus-specific PCR primers have been described (17, 18, 21). For the HLA-B locus, two sense PCR primers, 5B3 and 5B1, were used in a ratio of 1:1 for the amplification of both alleles at the HLA-B locus; 5B1 is required to obtain amplification of B*510102, which differs in the 5B3 primer site. Two antisense PCR primers, 3B1 and 3B1-AC, were used in a ratio of 2:1 for the amplification of the HLA-B locus; 3B1-AC is required to obtain amplification of B*73 alleles, which differ in the 3B1 primer site (19).

  • b K, G or T; S, C or G; Y, C or T; and R, A or G.

  • c Some of the sequencing primers have been described previously (20, 21).
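Several primers in Table 1 contain degenerate positions (K, S, Y, R), so each such primer stands for a small pool of concrete sequences. The following is a minimal sketch of that expansion, using only the four codes defined in the table footnote; the function name `expand_primer` is illustrative and not part of the study's software.

```python
from itertools import product

# Degenerate-base codes exactly as listed in the Table 1 footnote.
DEGENERATE = {"K": "GT", "S": "CG", "Y": "CT", "R": "AG"}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "")
    # Each position contributes either itself or its set of alternatives.
    choices = [DEGENERATE.get(base, base) for base in bases]
    return ["".join(combo) for combo in product(*choices)]

# HLA-C forward PCR primer 5Cin1-61 (one K position, so two sequences).
hla_c_forward = "AGC GAG GKG CCC GCC CGG CGA"
variants = expand_primer(hla_c_forward)
for seq in variants:
    print(seq)
```

A primer with two degenerate positions, such as 5AIn1-46 (S and Y), expands to four sequences in the same way.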

Characterization of new HLA alleles

The variant HLA-A allele was isolated by group-specific amplification using primers (5AIN1-46, AIN1-A, AIN1-G, AIN1-T, and 3AIN3-62) as previously described (21, 22). Sequencing of exons 2 and 3 used primers AINT1F, INT2R, INT2F, and AINT3R (Table 1). The HLA-C loci from cells carrying variant alleles were amplified with HLA-C locus-specific primers described in Table 1, and individual alleles were isolated by cloning using the TOPO TA vector (Invitrogen, Carlsbad, CA). Two to three individual clones carrying each new HLA-C allele were obtained and sequenced using primers described in Table 1. Exon 2 from the new HLA-DRB1 allele was amplified by PCR using intron primers (I1RB9/I2RB28) as previously described (23). Sequencing of the second exon was performed using sense primers (I1-RBSeq1 and 3) (23) and an antisense primer I1-RBSeq4 (24). All DNA sequence analysis of PCR products was carried out in both the 5′ and 3′ directions for at least two independent PCR reactions on an ABI 3730 Automated DNA Sequencer (Applied Biosystems). Allele designations were assigned by the WHO Nomenclature Committee for Factors of the HLA System (25).

Statistical analysis

PyPop (Python for Population genetics, http://www.pypop.org) was used to carry out all the following analyses (26, 27). Allele frequencies were obtained by direct counting. Allele frequencies at each HLA locus were evaluated for deviations from Hardy-Weinberg equilibrium proportions using the exact test of Guo and Thompson (28) and by chi-square testing when expected values were ≥5. Chi-square tests were performed for overall common genotypes (those expected to be seen in at least five instances), ‘lumped’ genotypes (the set of all genotypes individually expected to be seen in fewer than five instances each), all heterozygotes, all homozygotes, and individual common and heterozygote genotypes. These Hardy-Weinberg tests measure the degree to which observed genotype frequencies differ from those expected based on the allele frequencies for that population, assuming that the population is suitably large and experiences random mating (29).
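The direct-counting step and the Hardy-Weinberg expectation for heterozygotes can be sketched as follows. This is an illustration only, not PyPop's code: the function names and the four toy genotypes are assumptions made for the example, and the exact test and chi-square tests themselves are not reproduced here.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Estimate allele frequencies by direct counting over 2n chromosomes."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}

def expected_heterozygotes(freqs, n_individuals):
    """Expected heterozygote count under Hardy-Weinberg: n * (1 - sum p_i^2)."""
    return n_individuals * (1.0 - sum(p * p for p in freqs.values()))

def observed_heterozygotes(genotypes):
    """Count individuals carrying two different alleles."""
    return sum(1 for first, second in genotypes if first != second)

# Toy data (hypothetical individuals, not from the study).
genotypes = [
    ("A*0201", "A*2301"),
    ("A*0201", "A*0201"),
    ("A*3001", "A*2301"),
    ("A*0201", "A*3001"),
]
freqs = allele_frequencies(genotypes)
expected = expected_heterozygotes(freqs, len(genotypes))
observed = observed_heterozygotes(genotypes)
print(freqs, expected, observed)
```

The same relation links the study's own tables: with the observed homozygosity F = 0.0616 for HLA-A (Table 4), 564 × (1 − 0.0616) ≈ 529.26, which is the expected heterozygote count reported in Table 2.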

The Ewens–Watterson test of homozygosity was applied to each locus (30, 31), using Slatkin’s Monte-Carlo implementation of the exact test (32, 33). In this test, the observed homozygosity (F, the sum of the squares of the allele frequencies) is compared with the mean value of F expected for a population of the same size with the same number of alleles, undergoing neutral evolution. The normalized deviate of F (Fnd, the difference between the observed and the expected values of F, divided by the square root of the variance of the expected F) was also calculated for each locus (34).
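The two quantities defined above, F and Fnd, reduce to short formulas once the neutral expectation is known. The sketch below takes the expected mean and variance of F as inputs; in the actual test they come from the neutral model for the given sample size and allele number, which is not implemented here. The numeric inputs are made up for illustration.

```python
import math

def homozygosity_f(allele_freqs):
    """Observed homozygosity F: the sum of squared allele frequencies."""
    return sum(p * p for p in allele_freqs)

def normalized_deviate(f_observed, f_expected_mean, f_expected_variance):
    """Fnd: (observed F - expected F) / sqrt(variance of expected F)."""
    return (f_observed - f_expected_mean) / math.sqrt(f_expected_variance)

# Hypothetical locus with four alleles.
f_obs = homozygosity_f([0.4, 0.3, 0.2, 0.1])
# Hypothetical neutral expectation (mean 0.40, variance 0.0025).
fnd = normalized_deviate(f_obs, 0.40, 0.0025)
print(f_obs, fnd)
```

A negative Fnd, as in this toy case, means the allele frequencies are more even than the neutral expectation.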

Two-, three-, and four-locus haplotype frequencies were estimated using the iterative expectation maximization (EM) algorithm (35, 36). Linkage disequilibrium (LD) between alleles at each pair of loci, and two overall (locus-pair-level) measures of LD, normalized to values between 0 and 1, were calculated. The normalized allele-pair-level LD measure, D′ij, is the disequilibrium coefficient (D), divided by the upper and lower bounds of D for the particular alleles at each locus [as described in (37–39)] and ranges from +1 to −1. A D′ij value of 0 indicates linkage equilibrium, while a value of +1 indicates the complete association of a given pair of alleles in a single haplotype, and for the data reported here, a value of −1 indicates the complete absence of a haplotype composed of those alleles. (Note: The complete absence of a particular haplotype can only be inferred from a D′ij value of −1 when none of the reported alleles has a frequency greater than 0.5.) The first of the locus-pair-level measures, D′ (37), uses the products of the allele frequencies at each locus to weight the LD contribution of specific allele pairs, while the second, Wn (40), calculates a normalization of the chi-square statistic for deviations between observed and expected haplotype frequencies. The significance of the overall LD between any two loci was tested using the permutation distribution of the likelihood ratio test (36).
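The idea behind EM haplotype estimation can be sketched for the two-locus case. This is a bare-bones illustration under simplifying assumptions (complete genotypes, a fixed number of iterations, uniform starting frequencies), not the implementation used in the study; the allele labels A1, A2, B1, B2 and the function names are hypothetical.

```python
from collections import defaultdict

def phase_options(a_pair, b_pair):
    """List the haplotype pairs compatible with one unphased two-locus genotype."""
    (a1, a2), (b1, b2) = a_pair, b_pair
    options = {tuple(sorted([(a1, b1), (a2, b2)]))}
    options.add(tuple(sorted([(a1, b2), (a2, b1)])))
    return sorted(options)

def em_haplotypes(genotypes, iterations=50):
    """Estimate two-locus haplotype frequencies from unphased genotypes."""
    per_person = [phase_options(a, b) for a, b in genotypes]
    haplotypes = sorted({h for opts in per_person for pair in opts for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(iterations):
        counts = defaultdict(float)
        for opts in per_person:
            # E-step: weight each phase by its probability under current frequencies.
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2) for h1, h2 in opts]
            total = sum(weights)
            for (h1, h2), weight in zip(opts, weights):
                counts[h1] += weight / total
                counts[h2] += weight / total
        # M-step: re-estimate frequencies from the expected haplotype counts.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq

# Two unambiguous homozygotes and one double heterozygote of unknown phase.
genotypes = [
    (("A1", "A1"), ("B1", "B1")),
    (("A2", "A2"), ("B2", "B2")),
    (("A1", "A2"), ("B1", "B2")),
]
estimated = em_haplotypes(genotypes)
print(estimated)
```

In this toy data set the double heterozygote is resolved towards the A1:B1 / A2:B2 phase, because those haplotypes are already supported by the two homozygous individuals.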

Arlequin v3.0 (41) was used to compare the HLA-A:B haplotypes and HLA-C and DRB1 genotypes in this population with those in the sub-Saharan African populations from Kenya (29, 42), Mali (42), Rwanda (29), Senegal (29), South Africa (29), Uganda (42), Zambia (42), and Zimbabwe (29); the European populations from Croatia, the Czech Republic, Finland, Georgia, Northern Ireland, and Slovenia (29); an African American population (43); a European American population (43); an Afro-Cuban population; and a Euro-Cuban population (29) by calculating pairwise Fst values (and associated P values) for this entire set of populations. Fst is a measure of the genetic differentiation among subpopulations. Because not all populations had been genotyped at the same loci or at the same level of resolution, three comparisons were performed and the analysis focused on the amino acid sequences encoding the polymorphic antigen-binding groove. A given pair of population datasets was determined to differ significantly if the associated P value was less than 0.05.
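To make the notion of pairwise Fst concrete, the sketch below uses a simplified heterozygosity-based definition, Fst = (HT − HS) / HT, for two populations at one locus with equal weights and no sample-size correction. This is a stand-in for illustration and is not the estimator or the permutation test that Arlequin applies; the function names and frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def pairwise_fst(freqs_1, freqs_2):
    """Simplified Fst = (H_total - H_within) / H_total for two populations."""
    alleles = set(freqs_1) | set(freqs_2)
    pooled = {a: (freqs_1.get(a, 0.0) + freqs_2.get(a, 0.0)) / 2 for a in alleles}
    h_total = gene_diversity(pooled)
    h_within = (gene_diversity(freqs_1) + gene_diversity(freqs_2)) / 2
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total

# Hypothetical allele frequencies at one locus.
same = pairwise_fst({"x": 0.5, "y": 0.5}, {"x": 0.5, "y": 0.5})
fixed = pairwise_fst({"x": 1.0}, {"y": 1.0})
print(same, fixed)
```

Identical populations give 0 and populations fixed for different alleles give 1; real comparisons fall in between.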

Results

Sequence-based typing strategy

Class I locus-specific amplification was followed by sequencing of both forward and reverse strands of exons 2 and 3. Of the 564 individuals tested, many required the use of additional reagents to resolve alternative combinations of alleles at HLA-A, -B, and -C loci. Approximately 14% of HLA-A, 20% of HLA-B, and 16% of HLA-C typing results required either additional sequencing primers or a group-specific amplification and sequencing to resolve the typing result to a single combination of two alleles. Separate amplification of allele groups was used to obtain partial DRB1 exon 2 sequences, and 12% of DRB1 typing results required additional PCR or sequencing reagents to resolve alternative combinations. Additional information from the intermediate resolution probe-based testing was used to resolve a small percentage of the ambiguities (HLA-A 3%, HLA-B 2%, HLA-C 2%, and HLA-DRB1 1%).

Allele and genotype frequencies

The genotype frequencies of all four HLA loci were in Hardy-Weinberg equilibrium [Guo and Thompson P values = 0.0670 (HLA-A), 0.4032 (HLA-B), 0.6587 (HLA-C), 0.2818 (HLA-DRB1)]. In total, there were 563 unique HLA-A, -B, -C and -DRB1 phenotypes among the 564 individuals. The number of observed heterozygotes did not differ significantly from that expected under Hardy-Weinberg equilibrium (Table 2). The majority of alleles observed in the population had been reported in the 1996 HLA Nomenclature Report (e.g. 83% of the HLA-A alleles observed in this study were in the report; 91% HLA-B, 73% HLA-C, and 93% HLA-DRB1) (44). More recently reported alleles included A*0260, A*7409, B*0812, B*3528, Cw*0210, DRB1*01010102, and DRB1*030502. Only a small fraction of the known alleles (based on the April 2005 ImMunoGeneTics/HLA database release) was identified at each locus (Table 3). For HLA-A, only 42 (11%) of 368 known alleles with unique exons 2–3 sequences were observed in the African American population. Twelve alleles, present at ≥3%, contributed 76% of the allele frequency. Two alleles exhibited allele frequencies over 10%, A*0201G1 (12.1%) and A*2301G (10.8%) (Table 3 describes the ‘G’ nomenclature). For HLA-B, 67 (10%) of 665 known alleles were identified. Twelve alleles, present at ≥3%, contributed 65% of the allele frequency. Only B*530101 had a frequency greater than 10% (11.8%). For HLA-C, 36 (20%) of 178 known alleles were observed. Ten alleles, present at ≥3%, contributed the majority of the total frequency (81%). Two alleles were very frequent: Cw*0401G1 at 19.6% and Cw*0701G1 at 11.9%. For HLA-DRB1, 45 (11%) of 412 unique exon 2 sequences were observed. Twelve HLA-DRB1 alleles present at >3% contributed 77% of the allele frequency. DRB1*1503 was present at 12.2%.

Table 2.  Heterozygosity at the HLA-A, -B, -C and -DRB1 loci in an African American population

Locus | Number of subjects (n) (a) | Number of unique alleles identified (k) (a) | Observed heterozygotes | Expected H-W heterozygotes | P
HLA-A | 564 | 42 | 533 | 529.26 | 0.8708
HLA-B | 564 | 67 | 546 | 537.96 | 0.7289
HLA-C | 564 | 36 | 530 | 512.85 | 0.4488
HLA-DRB1 | 564 | 45 | 529 | 529.59 | 0.9797

  • a In PyPop, n and k are variables.

Table 3.  HLA allele frequencies in 564 random African American individuals (a)

HLA-A | Frequency | HLA-B | Frequency | HLA-C | Frequency | HLA-DRB1 | Frequency
010101G (b) | 0.04433 | 070201 | 0.07447 | 010201 (c) | 0.01064 | 010101 | 0.02216
0102 | 0.00709 | 0705G | 0.00798 | 020202 | 0.01684 | 010102 | 0.00089
0201G1 | 0.12145 | 0801G | 0.03989 | 020205 | 0.00089 | 010201 | 0.04344
0202 | 0.02926 | 0812 | 0.00089 | 0210 | 0.05762 | 0103 | 0.00177
0205 | 0.02216 | 1302 | 0.01064 | 0302 | 0.00975 | 030101 | 0.07004
0211G | 0.00089 | 1401 | 0.00621 | 030301 | 0.01596 | 030201 | 0.07358
022001 | 0.00089 | 140201 | 0.0266 | 030304 | 0.00089 | 030502 | 0.00089
0222 | 0.00089 | 1403 | 0.00443 | 030401G | 0.02571 | 040101 | 0.02571
0260 | 0.00177 | 150101G1 | 0.01152 | 030402 | 0.02482 | 0402 | 0.00089
030101G1 | 0.08599 | 1503 | 0.04965 | 0401G1 | 0.19592 | 040301 | 0.00177
110101G | 0.01330 | 1510 | 0.02748 | 0403 | 0.00089 | 0404 | 0.00355
2301G | 0.10816 | 1516 | 0.01862 | 040401 | 0.00089 | 040501 | 0.00709
2305 | 0.00089 | 151701 | 0.00798 | 0407 | 0.00266 | 040701 | 0.00532
2402G1 | 0.02039 | 1518 | 0.00089 | 050101G | 0.03280 | 0409 | 0.00089
2407 | 0.00177 | 1525 | 0.00089 | 0602 | 0.08511 | 0411 | 0.00089
250101 | 0.00443 | 1531 | 0.00089 | 0608 | 0.00177 | 070101 | 0.09043
2601G | 0.01418 | 180101 | 0.0328 | 0609 | 0.00089 | 080101 | 0.00798
260104 | 0.00089 | 2703 | 0.00355 | 0701G1 | 0.11879 | 080201 | 0.00177
2612 | 0.00089 | 2705G1 | 0.00887 | 070201G1 | 0.07270 | 080401 | 0.05496
29010101 | 0.00177 | 2706 | 0.00089 | 070401G | 0.00709 | 0806 | 0.00266
290201 | 0.03812 | 3501G1 | 0.06472 | 080101 | 0.00089 | 0811 | 0.00177
3001 | 0.08245 | 3503 | 0.00266 | 0802 | 0.04255 | 090102 | 0.02748
3002 | 0.05762 | 3505 | 0.00177 | 0804 | 0.01596 | 100101 | 0.01684
3004 | 0.00177 | 3508 | 0.00089 | 0813 | 0.00089 | 110101 | 0.01507
310102 | 0.01064 | 3528 | 0.00089 | 120201G | 0.00177 | 110102 | 0.08333
3201 | 0.01418 | 3543 | 0.00089 | 120301 | 0.00887 | 1102 | 0.03989
3301 | 0.02482 | 3701 | 0.00621 | 140201 | 0.01418 | 110401 | 0.00266
330301 | 0.05053 | 3801 | 0.00089 | 140203 | 0.00089 | 110402 | 0.00177
3402 | 0.03989 | 3901G1 | 0.00266 | 1403 | 0.00089 | 120101G | 0.03191
3403 | 0.00089 | 3903 | 0.00089 | 150201G | 0.00621 | 120201 | 0.00177
3601 | 0.03103 | 390602 | 0.00355 | 1505G1 | 0.01330 | 120202 | 0.00177
6601 | 0.01330 | 3910 | 0.00266 | 160101 | 0.09397 | 130101 | 0.0594
6602 | 0.00709 | 4001G1 | 0.01064 | 1608 | 0.00089 | 130201 | 0.06738
6603 | 0.00177 | 400201G | 0.00532 | 1701G | 0.08156 | 130301 | 0.03457
680101 | 0.02926 | 400601G1 | 0.00089 | 1704 | 0.00089 | 1304 | 0.01241
680102G | 0.00709 | 4016 | 0.00089 | 1801G | 0.03369 | 130502 | 0.00089
6802 | 0.04699 | 4101 | 0.00532 |  |  | 1311 | 0.00089
6901 | 0.00089 | 4102 | 0.01064 |  |  | 1331 | 0.00089
7401G | 0.05053 | 4103 | 0.00177 |  |  | 140101 | 0.01330
7409 | 0.00177 | 4201 | 0.05851 |  |  | 1402 | 0.00266
7411 | 0.00089 | 4202 | 0.00887 |  |  | 1404 | 0.00177
8001 | 0.00709 | 4402G1 | 0.01862 |  |  | 150101 | 0.02482
 |  | 440301 | 0.05053 |  |  | 150201 | 0.00177
 |  | 440302 | 0.00798 |  |  | 1503 | 0.12234
 |  | 4405 | 0.00089 |  |  | 160201 | 0.01596
 |  | 4410 | 0.00177 |  |  |  | 
 |  | 4501G | 0.04699 |  |  |  | 
 |  | 470101 | 0.00089 |  |  |  | 
 |  | 4802 | 0.00089 |  |  |  | 
 |  | 4901 | 0.02039 |  |  |  | 
 |  | 5001 | 0.00621 |  |  |  | 
 |  | 5101G1 | 0.01064 |  |  |  | 
 |  | 5109 | 0.00089 |  |  |  | 
 |  | 520101 | 0.00177 |  |  |  | 
 |  | 520102 | 0.01773 |  |  |  | 
 |  | 530101 | 0.11791 |  |  |  | 
 |  | 550101 | 0.00621 |  |  |  | 
 |  | 5601 | 0.00266 |  |  |  | 
 |  | 570101 | 0.00266 |  |  |  | 
 |  | 5702 | 0.00089 |  |  |  | 
 |  | 570301 | 0.04078 |  |  |  | 
 |  | 5704 | 0.00709 |  |  |  | 
 |  | 5801 | 0.03191 |  |  |  | 
 |  | 5802 | 0.04167 |  |  |  | 
 |  | 7801 | 0.01241 |  |  |  | 
 |  | 8101G | 0.02128 |  |  |  | 
 |  | 8201 | 0.00177 |  |  |  | 

  • a Alleles were assumed to be homozygous if the typing was consistent with a single allele in both intermediate and high-resolution testing.

  • b Alleles that were identical in exons 2 and 3 were not distinguished with the exception of B*180101 vs B*1817N (16). These alleles are indicated by the addition of the letter ‘G’ to the allele name. For example, A*010101G includes A*01010101 and all alleles identical to it as listed at http://www.ebi.ac.uk/imgt/hla/ambig.html (10). The groups and the primary allele in each group include A*0201G1, A*02010101; A*0211G, A*0211; A*030101G1, A*03010101; A*110101G, A*110101; A*2301G, A*2301; A*2402G1, A*24020101; A*2601G, A*260101; A*680102G, A*680102; A*7401G, A*7401; B*0705G, B*070501; B*0801G, B*080101; B*150101G1, B*15010101; B*2705G1, B*270502; B*3501G1, B*350101; B*3901G1, B*39010101; B*4001G1, B*400101; B*400201G, B*400201; B*400601G1, B*40060101; B*4402G1, B*44020101; B*4501G, B*4501; B*5101G1, B*510101; B*8101G, B*8101; Cw*030401G, Cw*030401; Cw*0401G1, Cw*04010101; Cw*050101G, Cw*050101; Cw*0701G1, Cw*070101; Cw*070201G1, Cw*07020101; Cw*070401G, Cw*070401; Cw*120201G, Cw*120201; Cw*150201G, Cw*150201; Cw*1505G1, Cw*150501; Cw*1701G, Cw*1701; Cw*1801G, Cw*1801; and DRB1*120101G, DRB1*120101.

  • c Or Cw*010202.

Ewens–Watterson homozygosity test

The Ewens–Watterson homozygosity statistic (F) and the normalized deviate of F (Fnd) were used to infer selective pressures acting on individual loci in the population (30, 31). The Ewens–Watterson model predicts a value of F for a population of a given size and showing a given number of alleles at a locus that is evolving in a neutral fashion. The value of the normalized deviate of F for such a population is 0. Observed Fnd values significantly greater than 0 are consistent with the operation of directional selection at that locus, while observed Fnd values significantly lower than 0 are consistent with the operation of balancing selection. Thus, significantly more ‘even’ than expected allele frequencies are consistent with balancing selection, while significantly more ‘skewed’ frequencies are consistent with directional selection or extreme demographic events (e.g. a population bottleneck) (29).

Results of the Ewens–Watterson homozygosity test are shown in Table 4. Negative Fnd values were observed at all four loci, and the Fnd values for the HLA-A and -DRB1 loci were significantly low (P = 0.0227 for HLA-A and 0.0408 for DRB1), consistent with the action of balancing selection at these loci. In addition, the application of a sign test to these Fnd values reveals an overall deviation (P value = 0.0455) from the expectation of neutral evolution (Fnd = 0), suggesting the action of balancing selection in shaping allelic diversity at all four of these loci.
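The text does not state how the sign-test P value was computed. One calculation that reproduces the reported figure is the large-sample (normal) approximation to the sign test without continuity correction, sketched below as an assumption; the function name is illustrative.

```python
import math

def sign_test_normal_approx(n_negative, n_total):
    """Two-sided sign test using the normal approximation to Binomial(n, 0.5)."""
    z = (n_negative - n_total / 2) / math.sqrt(n_total / 4)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

# Fnd values for HLA-A, -B, -C and -DRB1 as reported in Table 4.
fnd_values = [-1.2379, -0.8741, -0.7672, -1.1456]
n_negative = sum(1 for value in fnd_values if value < 0)
z, p = sign_test_normal_approx(n_negative, len(fnd_values))
print(z, round(p, 4))
```

With all four loci negative, z = 2 and the two-sided P value rounds to 0.0455, matching the value quoted above.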

Table 4.  Ewens–Watterson homozygosity test of neutrality

Locus | Observed F | Expected F | Normalized deviate of F (Fnd) | P
HLA-A | 0.0616 | 0.1048 | −1.2379 | 0.0227 (a)
HLA-B | 0.0462 | 0.0608 | −0.8741 | 0.1472
HLA-C | 0.0907 | 0.1234 | −0.7672 | 0.1987
HLA-DRB1 | 0.0610 | 0.0969 | −1.1456 | 0.0408 (a)

  • a Significant at the 5% level.

  • F, the sum of the squares of the allele frequencies.

New alleles

Six individuals carried novel alleles at the HLA-A, HLA-C, or HLA-DRB1 loci (Table 5). Two new alleles, A*260104 and DRB1*130502, exhibited synonymous substitutions. The former exhibited a unique substitution at a conserved position (codon 89), and the latter is a common alternative at codon 90 (ACA/ACG). Four alleles, A*7411, Cw*0813, Cw*1608, and Cw*1704, carried single, non-synonymous substitutions. The substitutions found in three of the alleles, A*7411, Cw*1608, and Cw*1704, were in conserved positions. Cw*0804 and Cw*0813 differ in that the more common arginine is found at codon 35 in Cw*0813. While none of these six variants showed novel sequence-specific probe hybridization patterns when typed with the LABType SSO kit, all were identified later by DNA sequencing.

Table 5.  New alleles identified during study

New allele (a) | Most similar known allele | Nucleotide substitution (b) | Codon/amino acid substitution (c) | GenBank accession number | Cell
A*260104 | A*260101 | GAG>GAA | 89/silent | DQ086782, DQ086783 | NT00587
A*7411 | A*7401 | CCG>TCG | 47/P>S | DQ086788, DQ086789 | NT00585
Cw*0813 | Cw*0804 | CAG>CGG | 35/Q>R | DQ105583, DQ105584 | NT00600
Cw*1608 | Cw*160101 | CCG>TCG | 50/P>S | DQ105585, DQ105586 | NT00599
Cw*1704 | Cw*1701 | GGG>GCG | 56/G>A | DQ135947, DQ135948 | NT00602
DRB1*130502 | DRB1*130501 | ACA>ACG | 90/silent | DQ135944 | NT00610

  • a Allele designations were assigned by the WHO Nomenclature Committee for Factors of the HLA System (25).

  • b Nucleotide of the previously reported allele is listed first. Differences are underlined.

  • c Codon 1 encodes the first amino acid of the mature polypeptide, and the amino acid encoded by the previously reported allele is shown first.
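The silent versus non-synonymous classification in Table 5 follows directly from the standard genetic code. The sketch below rebuilds the standard codon table and classifies each reported codon change; the function name `classify_substitution` is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(reference_codon, new_codon):
    """Return 'silent' or an 'X>Y' amino acid change for a codon substitution."""
    before = CODON_TABLE[reference_codon]
    after = CODON_TABLE[new_codon]
    return "silent" if before == after else f"{before}>{after}"

# Codon changes as listed in Table 5 (previously reported allele first).
table_5 = {
    "A*260104": ("GAG", "GAA"),
    "A*7411": ("CCG", "TCG"),
    "Cw*0813": ("CAG", "CGG"),
    "Cw*1608": ("CCG", "TCG"),
    "Cw*1704": ("GGG", "GCG"),
    "DRB1*130502": ("ACA", "ACG"),
}
effects = {allele: classify_substitution(*codons) for allele, codons in table_5.items()}
print(effects)
```

The output agrees with the codon/amino acid column of Table 5 for all six alleles.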

Haplotypes

Of the 378 A:B haplotypes predicted by the EM algorithm, 105 were found in three or more copies (Table 6). The most common haplotype was A*3001:B*4201, with a frequency of 0.03388. Nineteen other A:B haplotypes were found at frequencies greater than 0.01. Many of the 20 haplotypes share alleles. For example, B*5301 is found associated with four HLA-A alleles and A*2301G is associated with four HLA-B alleles. Of the 162 C:B haplotypes, 69 were found in three or more copies. The most common haplotype was Cw*0401G1:B*530101, with a frequency of 0.10178. Twenty-four other C:B haplotypes were observed at frequencies greater than 0.01, and two of these at frequencies greater than 0.05. Of the 373 B:DRB1 haplotypes, 105 were found in three or more copies. The most common haplotype was B*4201:DRB1*030201, with a frequency of 0.03861. Thirteen other B:DRB1 haplotypes had frequencies greater than 0.01.

Table 6.  HLA two-locus haplotypes (a) in 564 African American individuals identified in three or more individuals

A:B haplotype | Frequency | D′ij | C:B haplotype | Frequency | D′ij | B:DR haplotype | Frequency | D′ij
3001:4201 | 0.03388 | 0.54123 | 0401G1:530101 | 0.10178 | 0.82984 | 4201:030201 | 0.03861 | 0.63285
3601:530101 | 0.02369 | 0.73179 | 1701G:4201 | 0.05762 | 0.9835 | 530101:080401 | 0.02224 | 0.32506
030101G1:070201 | 0.02271 | 0.23953 | 070201G1:070201 | 0.05317 | 0.70974 | 530101:110102 | 0.02099 | 0.15192
010101G:0801G | 0.0185 | 0.43877 | 0210:1503 | 0.04699 | 0.94315 | 0801G:030101 | 0.01843 | 0.42148
7401G:1503 | 0.01583 | 0.28269 | 0401G1:3501G1 | 0.04302 | 0.58302 | 440301:1503 | 0.01783 | 0.26274
2301G:1503 | 0.01421 | 0.1996 | 0602:5802 | 0.03895 | 0.92868 | 440301:070101 | 0.01565 | 0.24117
330301:530101 | 0.0134 | 0.16687 | 0401G1:440301 | 0.0324 | 0.55384 | 530101:1503 | 0.01543 | 0.00971
2301G:4501G | 0.01337 | 0.19774 | 160101:4501G | 0.03103 | 0.62515 | 530101:130201 | 0.01396 | 0.10117
0201G1:4402G1 | 0.01307 | 0.66067 | 0701G1:0801G | 0.0248 | 0.57056 | 1503:070101 | 0.01377 | 0.20554
0201G1:4501G | 0.01303 | 0.17743 | 0802:140201 | 0.02394 | 0.89556 | 570301:130301 | 0.01345 | 0.36307
2301G:070201 | 0.01277 | 0.07101 | 0701G1:5801 | 0.02325 | 0.69193 | 070201:150101 | 0.01302 | 0.48635
680101:5802 | 0.01238 | 0.39819 | 0701G1:570301 | 0.02321 | 0.51108 | 1503:1503 | 0.01129 | 0.11966
330301:1516 | 0.01151 | 0.5977 | 030402:1510 | 0.02305 | 0.92653 | 3501G1:030201 | 0.01117 | 0.10684
0201G1:070201 | 0.01125 | 0.03367 | 0701G1:4901 | 0.0195 | 0.95066 | 070201:110102 | 0.01016 | 0.058
6802:530101 | 0.01114 | 0.13522 | 160101:520102 | 0.01773 | 1 | 180101:030101 | 0.00997 | 0.25137
0201G1:3501G1 | 0.01101 | 0.05542 | 050101G:4402G1 | 0.01507 | 0.80306 | 5801:1503 | 0.00983 | 0.21158
2301G:530101 | 0.01079 | −0.15353 | 1801G:570301 | 0.01418 | 0.39644 | 5802:120101G | 0.00948 | 0.2664
030101G1:3501G1 | 0.01065 | 0.08589 | 160101:3501G1 | 0.01282 | 0.11489 | 070201:090102 | 0.00942 | 0.28982
3402:3501G1 | 0.0105 | 0.21212 | 0602:4501G | 0.01241 | 0.1957 | 5802:1102 | 0.0094 | 0.20244
7401G:570301 | 0.01032 | 0.21328 | 140201:1516 | 0.01241 | 0.87263 | 070201:1503 | 0.00929 | 0.0027
0201G1:530101 | 0.00989 | −0.30966 | 1505G1:070201 | 0.01152 | 0.85592 | 3501G1:1503 | 0.00846 | 0.00947
3001:530101 | 0.00962 | −0.01028 | 160101:7801 | 0.01152 | 0.92116 | 530101:010201 | 0.00806 | 0.0766
2301G:440301 | 0.00951 | 0.08964 | 050101G:180101 | 0.01064 | 0.30141 | 180101:070101 | 0.00792 | 0.16596
290201:440301 | 0.00844 | 0.18 | 0804:8101G | 0.01064 | 0.65942 | 5802:130101 | 0.00774 | 0.13425
3001:4202 | 0.00795 | 0.88728 | 1701G:4102 | 0.01064 | 1 | 1503:110102 | 0.00757 | 0.07537
0205:530101 | 0.00778 | 0.26419 | 030401G:0801G | 0.00975 | 0.35352 | 4901:1503 | 0.00749 | 0.27928
0201G1:180101 | 0.00723 | 0.11282 | 030401G:4001G1 | 0.00975 | 0.91447 | 140201:1503 | 0.00744 | 0.17932
290201:530101 | 0.00711 | 0.07787 | 1801G:8101G | 0.00975 | 0.43945 | 0801G:1304 | 0.00709 | 0.55362
3301:7801 | 0.00709 | 0.56052 | 160101:440301 | 0.00925 | 0.09834 | 570301:130101 | 0.00705 | 0.12063
3002:140201 | 0.00691 | 0.21461 | 0701G1:070201 | 0.00889 | 0.00068 | 3501G1:010101 | 0.00696 | 0.26663
3402:530101 | 0.0068 | 0.05951 | 0602:530101 | 0.00805 | −0.19775 | 520102:130101 | 0.00683 | 0.34616
6802:070201 | 0.00676 | 0.07504 | 0602:1302 | 0.00798 | 0.72674 | 1516:010201 | 0.00679 | 0.336
030101G1:4201 | 0.00671 | 0.03147 | 1701G:4202 | 0.00798 | 0.89112 | 4501G:110102 | 0.00644 | 0.05866
6802:1510 | 0.00622 | 0.18825 | 030301:150101G1 | 0.00709 | 0.60915 | 1510:030101 | 0.0063 | 0.17125
0202:570301 | 0.00611 | 0.17527 | 070201G1:0705G | 0.00709 | 0.88018 | 4201:080401 | 0.00624 | 0.05841
030101G1:440301 | 0.0059 | 0.03361 | 1801G:5704 | 0.00709 | 1 | 530101:130301 | 0.00622 | 0.07037
030101G1:5802 | 0.00562 | 0.05337 | 050101G:151701 | 0.00621 | 0.77024 | 150101G1:040101 | 0.00621 | 0.52628
2301G:3501G1 | 0.00552 | −0.21076 | 0701G1:440302 | 0.0062 | 0.74678 | 1503:130201 | 0.0061 | 0.05948
330301:3501G1 | 0.00549 | 0.04692 | 0302:5801 | 0.0056 | 0.56027 | 8101G:120101G | 0.00604 | 0.26047
3001:440301 | 0.00547 | 0.02821 | 070401G:180101 | 0.00532 | 0.74152 | 530101:070101 | 0.00601 | −0.43641
3002:440301 | 0.00537 | 0.05167 | 0802:1401 | 0.00532 | 0.85079 | 4501G:070101 | 0.006 | 0.04098
2301G:8101G | 0.00532 | 0.15905 | 020202:180101 | 0.00531 | 0.29184 | 3501G1:130201 | 0.00587 | 0.02498
6602:5801 | 0.00532 | 0.74176 | 070201G1:0801G | 0.00446 | 0.04212 | 3501G1:1102 | 0.00569 | 0.08327
6601:5802 | 0.00527 | 0.37018 | 0102:2705G1 | 0.00443 | 0.49462 | 4201:130201 | 0.00569 | 0.03209
0201G1:150101G1 | 0.00523 | 0.37863 | 030301:440301 | 0.00443 | 0.23934 | 520102:130301 | 0.00559 | 0.29047
3402:440301 | 0.00514 | 0.08251 | 030301:550101 | 0.00443 | 0.70965 | 8101G:1503 | 0.00559 | 0.15986
0201G1:4201 | 0.00496 | −0.3018 | 0602:3701 | 0.00443 | 0.68771 | 4501G:010201 | 0.00554 | 0.08445
680101:530101 | 0.00485 | 0.05428 | 0802:1403 | 0.00443 | 1 | 180101:110102 | 0.00534 | 0.0867
2301G:0801G | 0.00478 | 0.013 | 160101:1516 | 0.00443 | 0.15899 | 4501G:130101 | 0.00499 | 0.04982
3002:180101 | 0.00475 | 0.0925 | 0602:5001 | 0.0044 | 0.68282 | 5801:110102 | 0.00497 | 0.0791
330301:1510 | 0.00471 | 0.12719 | 0602:3501G1 | 0.00356 | −0.35339 | 140201:030101 | 0.0049 | 0.12279
330301:4201 | 0.00456 | 0.03369 | 070201G1:390602 | 0.00355 | 1 | 3501G1:010201 | 0.00483 | 0.04972
030101G1:140201 | 0.0045 | 0.09098 | 0701G1:180101 | 0.00355 | −0.08996 | 570301:1503 | 0.00474 | −0.0499
030101G1:5704 | 0.00443 | 0.58972 | 0701G1:530101 | 0.00291 | −0.79225 | 1510:080401 | 0.00457 | 0.11764
3402:8101G | 0.00443 | 0.17544 | 0401G1:5802 | 0.00272 | −0.66697 | 4501G:130201 | 0.00452 | 0.03096
3002:530101 | 0.00436 | −0.35837 | 0401G1:180101 | 0.00267 | −0.58426 | 530101:1102 | 0.0045 | −0.04278
010101G:530101 | 0.00435 | −0.16839 | 0102:5601 | 0.00266 | 1 | 070201:010101 | 0.00443 | 0.13563
3002:1510 | 0.00431 | 0.10514 | 020202:2705G1 | 0.00266 | 0.28801 | 4001G1:040101 | 0.00443 | 0.40127
2601G:0801G | 0.00428 | 0.27268 | 020202:400201G | 0.00266 | 0.49143 | 5801:070101 | 0.00421 | 0.04564
0205:5801 | 0.00425 | 0.16489 | 0210:180101 | 0.00266 | 0.02489 | 7801:110102 | 0.00421 | 0.27888
3002:5802 | 0.00419 | 0.04565 | 030401G:150101G1 | 0.00266 | 0.21047 | 4402G1:040101 | 0.00391 | 0.18923
7401G:180101 | 0.00412 | 0.07913 | 0602:570101 | 0.00266 | 1 | 4501G:030101 | 0.0039 | 0.01394
0201G1:5101G1 | 0.00395 | 0.28444 | 070201G1:3901G1 | 0.00266 | 1 | 3501G1:130101 | 0.00379 | −0.0141
2301G:5801 | 0.00394 | 0.01699 | 0802:4101 | 0.00266 | 0.47778 | 570301:070101 | 0.00377 | 0.00217
0201G1:2705G1 | 0.00389 | 0.36137 | 0804:1302 | 0.00266 | 0.23784 | 1510:130101 | 0.00376 | 0.08248
2301G:1510 | 0.00377 | 0.03264 | 0804:1510 | 0.00266 | 0.14312 | 530101:130101 | 0.00376 | −0.46338
3002:570301 | 0.00377 | 0.03705 | 120301:3910 | 0.00266 | 1 | 570301:030101 | 0.00363 | 0.0205
0201G1:520102 | 0.00366 | 0.09695 | 150201G:5101G1 | 0.00266 | 0.42243 | 4501G:030201 | 0.0036 | 0.00319
0102:4901 | 0.00355 | 0.48959 | 160101:5101G1 | 0.00266 | 0.17221 | 070201:1102 | 0.00349 | 0.01403
2402G1:3501G1 | 0.00355 | 0.11675 |  |  |  | 1510:1503 | 0.00349 | 0.00527
3001:520102 | 0.00355 | 0.12812 |  |  |  | 3501G1:030101 | 0.00348 | −0.23145
330301:1503 | 0.00355 | 0.02201 |  |  |  | 4202:080401 | 0.00347 | 0.35628
3001:070201 | 0.00346 | −0.43614 |  |  |  | 5001:070101 | 0.00346 | 0.51423
0202:4501G | 0.00345 | 0.07451 |  |  |  | 070201:130201 | 0.00339 | −0.32523
6802:5802 | 0.00344 | 0.03724 |  |  |  | 4102:130301 | 0.00336 | 0.29107
290201:4901 | 0.00339 | 0.13303 |  |  |  | 0801G:110102 | 0.00314 | −0.05572
7401G:3501G1 | 0.00327 | 0.00007 |  |  |  | 140201:130201 | 0.0031 | 0.05281
010101G:070201 | 0.00317 | −0.0398 |  |  |  | 530101:030101 | 0.00309 | −0.62612
0201G1:4901 | 0.00316 | 0.03792 |  |  |  | 5802:130201 | 0.00304 | 0.00597
3002:3501G1 | 0.00313 | −0.16015 |  |  |  | 0801G:130201 | 0.003 | 0.00839
030101G1:8101G | 0.00306 | 0.06322 |  |  |  | 1516:1503 | 0.0029 | 0.03829
0201G1:140201 | 0.00305 | −0.05479 |  |  |  | 070201:110101 | 0.00277 | 0.11783
6802:570301 | 0.00305 | 0.02924 |  |  |  | 1503:090102 | 0.00271 | 0.05158
7401G:140201 | 0.00305 | 0.06765 |  |  |  | 5802:110102 | 0.00271 | −0.21936
3002:5801 | 0.00284 | 0.03334 |  |  |  | 530101:120101G | 0.00269 | −0.28452
010101G:570301 | 0.00272 | 0.02329 |  |  |  | 5801:1102 | 0.00269 | 0.04639
3001:570301 | 0.00272 | −0.19166 |  |  |  | 070201:030101 | 0.00266 | −0.49005
0201G1:0801G | 0.00271 | −0.43998 |  |  |  | 1302:010101 | 0.00266 | 0.233
6601:1503 | 0.00269 | 0.16023 |  |  |  | 1302:070101 | 0.00266 | 0.17544
010101G:3701 | 0.00266 | 0.40207 |  |  |  | 1403:080401 | 0.00266 | 0.57674
0202:1516 | 0.00266 | 0.11703 |  |  |  | 1510:070101 | 0.00266 | 0.00698
0202:4901 | 0.00266 | 0.10423 |  |  |  | 151701:1102 | 0.00266 | 0.30563
030101G1:4001G1 | 0.00266 | 0.17944 |  |  |  | 3701:100101 | 0.00266 | 0.41878
110101G:550101 | 0.00266 | 0.42087 |  |  |  | 440301:160201 | 0.00266 | 0.12232
2301G:520102 | 0.00266 | 0.04692 |  |  |  | 440302:110102 | 0.00266 | 0.27273
2402G1:150101G1 | 0.00266 | 0.21476 |  |  |  | 440302:1503 | 0.00266 | 0.2404
2402G1:390602 | 0.00266 | 0.7448 |  |  |  | 4402G1:070101 | 0.00266 | 0.05764
2402G1:4501G | 0.00266 | 0.08756 |  |  |  | 5101G1:080101 | 0.00266 | 0.32616
3301:4102 | 0.00266 | 0.23091 |  |  |  | 5101G1:130101 | 0.00266 | 0.20264
3301:4501G | 0.00266 | 0.06312 |  |  |  | 5704:160201 | 0.00266 | 0.36486
6802:0801G | 0.00266 | 0.02065 |  |  |  | 8101G:080401 | 0.00266 | 0.07411
6802:440302 | 0.00266 | 0.30047 |  |  |  | 0705G:090102 | 0.00264 | 0.31175
8001:180101 | 0.00266 | 0.3538 |  |  |  | 0801G:1503 | 0.00264 | −0.45856
0201G1:4001G1 | 0.00264 | 0.14395 |  |  |  | 4102:090102 | 0.00262 | 0.22518
2301G:5802 | 0.00263 | −0.41647 |  |  |  | 5802:070101 | 0.00262 | −0.30376

  • a Alleles that were identical in exons 2 and 3 were not distinguished with the exception of B*180101. These alleles are indicated by the addition of the letter ‘G’ to the allele name. A list of these alleles is given in Table 3 and can be found at http://www.ebi.ac.uk/imgt/hla/ (10).

LD measures of the strength of association between pairs of HLA loci showed, as previously reported (45), that B:C associations are stronger than the associations between other loci, but all pairwise associations are statistically significant (P < 0.0001) (Table 7). Table 6 shows the relative LD (D′ij) value for each two-locus haplotype. Values can range from −1.0 to 1.0. For the data reported here, values over 0.4 indicate strong positive associations between alleles in a haplotype and values below −0.75 indicate strong negative associations. Values close to zero reflect random associations of common alleles (linkage equilibrium). As expected, several C:B haplotypes have D′ij values of 1 (e.g. Cw*160101:B*520102 and Cw*1701G:B*4102); 46 haplotypes received scores over 0.4. A:B values begin at 0.88728 (A*3001:B*4202) with 13 haplotypes with scores over 0.4; and B:DRB1 values begin at 0.63285 (B*4201:DRB1*030201) with 9 haplotypes with scores over 0.4.
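The relative LD value for a single haplotype can be illustrated with a short sketch. It assumes the standard Lewontin normalization, in which the raw disequilibrium (haplotype frequency minus the product of the two allele frequencies) is divided by the largest magnitude it could take given those allele frequencies. The function name and the example frequencies are hypothetical and are not taken from Table 6.

```python
def relative_ld(hap_freq: float, p_i: float, q_j: float) -> float:
    """Normalized disequilibrium D'_ij for one two-locus haplotype.

    hap_freq: estimated frequency of the haplotype carrying alleles i and j
    p_i, q_j: frequencies of allele i (first locus) and allele j (second locus)
    """
    d = hap_freq - p_i * q_j  # raw disequilibrium D_ij
    if d >= 0:
        # Largest possible positive D given the allele frequencies
        d_max = min(p_i * (1 - q_j), (1 - p_i) * q_j)
    else:
        # Largest possible magnitude of a negative D
        d_max = min(p_i * q_j, (1 - p_i) * (1 - q_j))
    return d / d_max if d_max > 0 else 0.0


# Hypothetical cases: an allele found only on one background (D' near 1),
# random association (D' = 0), and two alleles never found together (D' near -1).
complete = relative_ld(0.02, 0.02, 0.05)
random_assoc = relative_ld(0.1 * 0.2, 0.1, 0.2)
never_together = relative_ld(0.0, 0.1, 0.2)
```

A value of 1 therefore means that the rarer allele is always found with its partner, which is the situation reported above for several C:B haplotypes.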

Table 7.  Pairwise global linkage disequilibrium (LD) estimates
HLA loci pairs | D′ᵃ | Wnᵃ | P
A-C | 0.49491 | 0.32098 | <0.0001
A-B | 0.57709 | 0.40769 | <0.0001
A-DRB1 | 0.47157 | 0.34663 | <0.0001
C-B | 0.87957 | 0.63329 | <0.0001
C-DRB1 | 0.48482 | 0.29744 | <0.0001
B-DRB1 | 0.57989 | 0.43992 | <0.0001

ᵃ For D′, values above 0 show a positive association between loci, and values above 0.5 indicate a very strong association. Wn is an alternative measurement of LD, which is interpreted in a similar fashion to D′.
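As an illustration of how two global measures of this kind can be obtained from a table of two-locus haplotype frequencies, the sketch below assumes commonly used definitions: the global D′ as the mean of |D′ij| weighted by the product of the allele frequencies, and Wn as Cramér's V statistic. These definitions, the function name, and the toy frequencies are assumptions made for illustration; they are not confirmed by this section as the exact computation behind Table 7.

```python
from math import sqrt


def global_ld(hap):
    """Return (global D', Wn) for a dict mapping (allele_i, allele_j) -> frequency."""
    # Marginal allele frequencies at each locus
    p, q = {}, {}
    for (a, b), f in hap.items():
        p[a] = p.get(a, 0.0) + f
        q[b] = q.get(b, 0.0) + f

    d_prime = 0.0
    chi = 0.0
    for a, pa in p.items():
        for b, qb in q.items():
            if pa * qb == 0:
                continue  # allele absent from the sample, nothing to add
            d = hap.get((a, b), 0.0) - pa * qb
            if d >= 0:
                d_max = min(pa * (1 - qb), (1 - pa) * qb)
            else:
                d_max = min(pa * qb, (1 - pa) * (1 - qb))
            if d_max > 0:
                d_prime += pa * qb * abs(d) / d_max
            chi += d * d / (pa * qb)

    k = min(len(p), len(q))
    w_n = sqrt(chi / (k - 1)) if k > 1 else 0.0
    return d_prime, w_n


# Toy data: complete association versus random association between two biallelic loci
linked = {("a1", "b1"): 0.5, ("a2", "b2"): 0.5}
unlinked = {("a1", "b1"): 0.25, ("a1", "b2"): 0.25,
            ("a2", "b1"): 0.25, ("a2", "b2"): 0.25}
```

Under these definitions both statistics run from 0 (no association) to 1 (complete association), which is why the two columns of Table 7 rank the locus pairs similarly.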

Of the 729 three-locus A:B:DRB1 haplotypes, 73 were found in three or more copies (data not shown). Of the 792 four-locus A:C:B:DRB1 haplotypes, 65 occurred in three or more copies with a combined frequency of 0.27914 (Table 8) and 634 occurred only once summing to a frequency of 0.5633 (data available on www.dodmarrow.org). Ten or more copies of the following four haplotypes were observed: A*3001:Cw*1701G:B*4201:DRB1*030201 (0.02174), A*3601:Cw*0401G1:B*530101:DRB1*110102 (0.01324), A*010101G:Cw*0701G1:B*0801G:DRB1*030101 (0.01321), and A*330301:Cw*0401G1:B*530101:DRB1*080401 (0.00969).
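Because 564 individuals contribute 2 × 564 = 1128 chromosomes, an estimated haplotype frequency can be converted to an approximate number of copies. The sketch below assumes only this simple conversion; because the frequencies are estimated rather than counted directly, the implied copy numbers need not be whole numbers.

```python
N_INDIVIDUALS = 564
N_CHROMOSOMES = 2 * N_INDIVIDUALS  # 1128 haplotypes in the sample


def expected_copies(frequency: float) -> float:
    """Approximate number of haplotype copies implied by an estimated frequency."""
    return frequency * N_CHROMOSOMES


for freq in (0.02174, 0.00969, 0.00266):
    print(f"{freq:.5f} -> {expected_copies(freq):.1f} copies")
# 0.02174 -> 24.5 copies
# 0.00969 -> 10.9 copies
# 0.00266 -> 3.0 copies
```

On this basis, a frequency of about 0.00266 corresponds to the three-copy threshold used for Table 8.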

Table 8.  HLA four-locus haplotypes in 564 African American individuals identified in three or more individuals
A:C:B:DRB1 haplotypeᵃ | Frequency
3001:1701G:4201:030201 | 0.02174
3601:0401G1:530101:110102 | 0.01324
010101G:0701G1:0801G:030101 | 0.01321
330301:0401G1:530101:080401 | 0.00969
2301G:0210:1503:070101 | 0.00798
030101G1:070201G1:070201:150101 | 0.00709
7401G:0701G1:570301:130301 | 0.00703
2301G:0401G1:440301:070101 | 0.00621
3002:0802:140201:1503 | 0.00615
7401G:0210:1503:130201 | 0.00532
6802:0401G1:530101:130101 | 0.00532
6802:070201G1:070201:1503 | 0.00532
680101:0602:5802:120101G | 0.00532
0201G1:1701G:4201:030201 | 0.00526
0201G1:160101:4501G:010201 | 0.00499
030101G1:1701G:4201:030201 | 0.00468
3001:1701G:4201:130201 | 0.00453
3001:0401G1:530101:080401 | 0.00448
6602:0701G1:5801:1503 | 0.00443
0205:0401G1:530101:1503 | 0.00443
2301G:0210:1503:1503 | 0.00443
290201:160101:440301:070101 | 0.00443
3402:0401G1:440301:1503 | 0.00443
2301G:070201G1:070201:070101 | 0.00399
3001:1701G:4201:080401 | 0.00356
6601:0602:5802:130101 | 0.00355
3001:0701G1:070201:110102 | 0.00355
0201G1:0401G1:530101:130101 | 0.00355
3402:160101:3501G1:030201 | 0.00355
7401G:0210:1503:1503 | 0.00355
2301G:0602:4501G:070101 | 0.00346
010101G:0701G1:0801G:070101 | 0.00275
0201G1:160101:4501G:110102 | 0.00272
0201G1:0701G1:570301:130301 | 0.00272
330301:1701G:4201:030201 | 0.00271
2301G:030401G:0801G:1503 | 0.00266
2301G:070201G1:070201:090102 | 0.00266
0201G1:050101G:4402G1:070101 | 0.00266
0201G1:0401G1:440301:1503 | 0.00266
030101G1:070201G1:070201:010101 | 0.00266
0202:1801G:570301:130101 | 0.00266
030101G1:1801G:8101G:1503 | 0.00266
3601:0401G1:530101:130201 | 0.00266
330301:140201:1516:010201 | 0.00266
010101G:0401G1:530101:1503 | 0.00266
3002:1801G:570301:130101 | 0.00266
010101G:1701G:4102:130301 | 0.00266
2301G:0602:530101:130201 | 0.00266
2301G:0401G1:530101:010201 | 0.00266
6802:030402:1510:030101 | 0.00266
6802:030401G:0801G:1304 | 0.00266
010101G:0602:3701:100101 | 0.00266
7401G:0210:1503:110102 | 0.00266
0201G1:050101G:4402G1:130201 | 0.00266
0102:0701G1:4901:1503 | 0.00266
030101G1:0401G1:3501G1:010101 | 0.00266
3402:0401G1:3501G1:1102 | 0.00266
2601G:030401G:0801G:1304 | 0.00266
0202:0401G1:530101:080401 | 0.00266
3001:1701G:4202:030201 | 0.00266
330301:0701G1:5801:130201 | 0.00266
2301G:160101:520102:130101 | 0.00266
2301G:070201G1:070201:1102 | 0.00266
2301G:160101:4501G:130201 | 0.00266
330301:0401G1:530101:010201 | 0.00263

ᵃ Alleles that were identical in exons 2 and 3 were not distinguished with the exception of B*180101. These alleles are indicated by the addition of the letter ‘G’ to the allele name. A list of these alleles is given in Table 2 and can be found at http://www.ebi.ac.uk/imgt/hla/ (10).

Admixture

To estimate the degree of difference between this population, other African American populations, and African and European populations, Fst (genetic distance) values were calculated between all pairs of populations in a set of sub-Saharan African, European, African American, European American, Afro-Cuban, and Euro-Cuban populations for HLA-A:B haplotypes and HLA-C and DRB1 genotypes, considering only variation at the amino acid level for exons 2 and 3 at the class I loci and for exon 2 at the DRB1 locus. The mean pairwise Fst value was then calculated within and between each group (sub-Saharan African, European, or African American) and between each African American/Afro-Cuban population and the set of sub-Saharan African or European populations. We tested the null hypothesis of population identity (any two population datasets being drawn from the same population) between the African American and the Afro-Cuban populations by considering the P value for each pairwise Fst value, with a significant P value indicating a significant difference between population datasets. These data are summarized in Tables 9 and 10. For each comparison, the mean Fst value for the sub-Saharan African populations is greater than that for the European populations, consistent with previous observations of greater HLA diversity in the sub-Saharan African populations relative to the rest of the world. The mean Fst within the African American populations (for A:B haplotypes and HLA-C genotypes; DRB1 genotypes were not available for another African American population) was an order of magnitude lower than for either of the larger groups, indicating (as might be expected) that these African American populations are not nearly as highly differentiated from one another as the various populations of sub-Saharan Africa and Europe.
In addition, the pairwise Fst P values for the African American populations were non-significant for A:B haplotypes, a finding consistent with these datasets being drawn from the same population. However, the African American population reported here and the population reported by Cao et al. (43) differed significantly at the HLA-C locus; these differences may reflect the identification of alleles not known at the time of the previous study (e.g. the recently described Cw*0210 may have been assigned as Cw*0202 in the previous study).
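To make the distance measure concrete, the following sketch computes a simple heterozygosity-based Fst between two populations from their allele (or haplotype) frequencies, and then the mean over all pairs in a group. It uses the textbook form Fst = (H_T − H_S)/H_T with the two populations weighted equally. This is an illustrative simplification and should not be read as the estimator implemented in the software used for the analyses reported here; the function names and the toy frequencies are hypothetical.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity: one minus the sum of squared frequencies."""
    return 1.0 - sum(f * f for f in freqs.values())


def pairwise_fst(pop1, pop2):
    """Fst between two populations given dicts of allele -> frequency."""
    alleles = set(pop1) | set(pop2)
    # Frequencies in the combined population, weighting both samples equally
    pooled = {a: (pop1.get(a, 0.0) + pop2.get(a, 0.0)) / 2 for a in alleles}
    h_t = heterozygosity(pooled)                              # total
    h_s = (heterozygosity(pop1) + heterozygosity(pop2)) / 2   # within populations
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


def mean_pairwise_fst(populations):
    """Mean Fst over all unordered pairs in a list of populations."""
    pairs = list(combinations(populations, 2))
    return sum(pairwise_fst(a, b) for a, b in pairs) / len(pairs)


# Toy populations: two identical, one sharing no alleles with the others
pop_a = {"x": 0.5, "y": 0.5}
pop_b = {"x": 0.5, "y": 0.5}
pop_c = {"z": 1.0}
```

Identical populations give a distance of 0, and populations that share no alleles give larger values, so a small mean within a group (as seen for the African American populations in Table 9) indicates little differentiation among its members.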

Table 9.  Genetic distance as measured by mean pairwise Fst values between population groups
Comparison | A:B haplotypes | HLA-C genotypes | DRB1 genotypes
Mean SSAᵃ | 0.01596 | 0.01965 | 0.03552
Mean EURᵇ | 0.01294 | 0.01344 | 0.01910
Mean AfAmᶜ | 0.00095 | 0.00142 | nt
AfAm1-SSA | 0.00974 | 0.01332 | 0.01604
AfAm2-SSAᶜ | 0.00980 | 0.01120 | nt
AfAm3-SSAᶜ | 0.01362 | nt | nt
AfAm1-EUR | 0.03303 | 0.02026 | 0.03784
AfAm2-EUR | 0.03027 | 0.01654 | nt
AfAm3-EUR | 0.01889 | nt | nt
SSA-EUR | 0.04778 | 0.03401 | 0.06124

AfAm, African American; EUR, European; nt, not tested; SSA, sub-Saharan African.

ᵃ For A:B haplotype comparisons, SSA includes populations from Kenya (three total), Senegal, Zimbabwe, Mali, Uganda, Zambia, and South Africa. For HLA-C genotype comparisons, SSA includes populations from Kenya (three total), Zimbabwe, Mali, Uganda, Zambia, and South Africa. For DRB1 genotype comparisons, SSA includes populations from Zimbabwe, Mali, Rwanda, and South Africa.

ᵇ For A:B haplotype comparisons, EUR includes populations from the Czech Republic, Georgia, Finland, Croatia, and Northern Ireland, as well as the European American and Euro-Cuban populations. For HLA-C comparisons, EUR includes populations from the Czech Republic, Georgia, Finland, and Northern Ireland, as well as the European American population. For DRB1 comparisons, EUR includes populations from the Czech Republic, Finland, Slovenia, and Northern Ireland.

ᶜ AfAm1, the population reported here. AfAm2, population described by Cao et al. (43). AfAm3, Afro-Cuban population (29). DRB1 data were not available for AfAm2. HLA-C and DRB1 data were not available for AfAm3.

Table 10.  Test for population identity
Comparison | P value of pairwise Fst value (A:B haplotype) | P value of pairwise Fst value (HLA-C)
AfAm1-AfAm2ᵃ | 0.07207 ± 0.0326 | 0.01802 ± 0.0121
AfAm1-AfAm3 | 0.18919 ± 0.0394 |
AfAm2-AfAm3 | 0.30631 ± 0.0388 |

ᵃ AfAm1, the African American population reported here. AfAm2, population described by Cao et al. (43). AfAm3, Afro-Cuban population (29). HLA-C and DRB1 data were not available for this dataset.
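A test of population identity of this kind is commonly carried out by permutation: the two samples are pooled, the population labels are shuffled many times, and the P value is the proportion of shuffled datasets whose Fst is at least as large as the observed value. The sketch below illustrates that general idea under stated assumptions (a simple heterozygosity-based Fst, equal weighting of the two samples, and hypothetical function names and data); it is not a reproduction of the procedure or software that produced Table 10.

```python
import random
from collections import Counter


def _freqs(sample):
    """Allele (or haplotype) frequencies in a list of observed copies."""
    n = len(sample)
    return {allele: count / n for allele, count in Counter(sample).items()}


def _fst(sample1, sample2):
    """Heterozygosity-based Fst between two samples, weighted equally."""
    f1, f2 = _freqs(sample1), _freqs(sample2)
    pooled = {a: (f1.get(a, 0.0) + f2.get(a, 0.0)) / 2 for a in set(f1) | set(f2)}
    h1 = 1 - sum(v * v for v in f1.values())
    h2 = 1 - sum(v * v for v in f2.values())
    h_t = 1 - sum(v * v for v in pooled.values())
    return (h_t - (h1 + h2) / 2) / h_t if h_t > 0 else 0.0


def identity_p_value(sample1, sample2, n_perm=999, seed=0):
    """Permutation P value for the null hypothesis that both samples
    were drawn from the same population."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = _fst(sample1, sample2)
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign copies to the two populations at random
        if _fst(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical samples: one pair with no shared alleles, one pair with identical composition
p_different = identity_p_value(["A"] * 50, ["B"] * 50, n_perm=199)
p_same = identity_p_value(["A", "B"] * 25, ["A", "B"] * 25, n_perm=199)
```

A small P value (as for the HLA-C comparison between AfAm1 and AfAm2) indicates that the observed differentiation is rarely matched when labels are assigned at random, whereas a large P value is consistent with the two datasets being drawn from the same population.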

In terms of individual genetic distances between the African American populations and the populations of sub-Saharan Africa and Europe, A:B haplotype comparisons indicate that the African American population described here is closer to sub-Saharan African populations (Fst = 0.00974) than either the African American population described by Cao et al. (0.00980) or the Afro-Cuban population described by Meyer et al. (0.01362) and is more distant from the European populations (0.03303) than the African American population described by Cao et al. (0.03027) or the Afro-Cuban population (0.01889). Although these differences are marginal and these populations are not significantly different from one another, this pattern suggests that the degree of European admixture in the African American population described here is lower than that in the other two. The pattern for the HLA-C comparisons is not identical to that for A:B haplotypes: the African American population described here is more distant from both the sub-Saharan African and the European populations (Fst = 0.01332 and 0.02026, respectively) than the African American population described by Cao et al. (0.01120 and 0.01654). Nevertheless, a lower degree of European admixture, relative to the African American population described by Cao et al., can still be inferred for the population described here.

Some alleles in the African American populations (e.g. A*2407, B*1525, B*2706) were not observed in African populations (42). Some of these alleles were initially identified in American Indian and/or Asian populations or found uniquely in these populations (43) and may reflect admixture with these groups (3–5). These individuals may also carry other alleles common to American Indian and/or Asian populations; for example, one A*2407-positive individual also carries B*3505, which was first described in an American Indian (46), and DRB1*1202, which is more common in Asian populations (10, 12).

Discussion

This study provides allele and haplotype frequency data from 564 African American individuals. The homozygosity values for the HLA-A, -B, -C, and -DRB1 loci in this African American population are consistent with those seen in other African American and sub-Saharan African populations (as well as populations from the rest of the world) (11, 12, 22, 29, 47). For example, significantly low Fnd values were observed at the HLA-A and -C loci in an African American population (29, 43) and at subsets of the class I and DRB1 loci in a variety of sub-Saharan African populations from Cameroon, Mali, Kenya, Zimbabwe, Uganda, Zambia, and South Africa (22, 29, 42). In all these populations, strong evidence of balancing selection is seen at the HLA-A and -DRB1 loci, and weak balancing selection (perhaps modulated by other selective forces) can be inferred from negative Fnd values at the HLA-B and -C loci.

Balancing selection may result when evolving pathogens confer selective advantage to low-frequency alleles (frequency-dependent selection), when heterozygosity confers a selective advantage over homozygosity (overdominance), or when changing environmental conditions favor distinct phenotypes (environmental heterozygosity). Alternatively, balancing selection may be inferred erroneously when low-frequency alleles are not detected, skewing the allele frequency distribution in favor of more common alleles. Given that a number of novel alleles have been detected in this population at the HLA-A, -C, and -DRB1 loci, the genotyping method used appears to be sufficiently sensitive to rule out an erroneous inference of balancing selection.
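The homozygosity statistic behind these inferences can be sketched as follows. The sketch assumes the usual definition of the normalized deviate, Fnd = (F_obs − F_exp)/SD(F_exp), where F_obs is the sum of squared allele frequencies. The expected homozygosity and its variance under neutrality depend on the sample size and the number of alleles and would normally come from the neutral sampling distribution; here they are simply passed in as hypothetical inputs, and all numbers shown are illustrative rather than values from this study.

```python
from math import sqrt


def observed_homozygosity(allele_counts):
    """Sum of squared allele frequencies from a list of allele counts."""
    n = sum(allele_counts)
    return sum((c / n) ** 2 for c in allele_counts)


def normalized_deviate(f_obs, f_exp, var_exp):
    """Fnd: observed minus expected homozygosity, in standard deviation units."""
    return (f_obs - f_exp) / sqrt(var_exp)


# Hypothetical locus with four alleles: an even distribution versus a skewed one
even = observed_homozygosity([25, 25, 25, 25])     # 0.25
skewed = observed_homozygosity([97, 1, 1, 1])      # about 0.94

# Hypothetical neutral expectation (mean 0.40, variance 0.01) for illustration only
fnd_even = normalized_deviate(even, 0.40, 0.01)
```

An even distribution gives a low observed homozygosity and hence a negative Fnd, the direction associated with balancing selection; undetected rare alleles would shift the comparison, which is the concern addressed in the paragraph above.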

The lack of any overall deviations from expected Hardy-Weinberg equilibrium proportions and the detection of selective forces similar to those seen in other African American and sub-Saharan African populations operating in this population suggest that this collection of subjects represents a valid subset of the African American population. This inference is further supported by the low degree of differentiation observed at the class I loci for this population relative to other African American and Afro-Caribbean population samples and the lack of significant differences between A:B haplotypes in these populations. Thus, this population may be useful for additional diversity studies and can serve as a basis for predicting allele and haplotype frequencies in the search for unrelated hematopoietic stem cell donors from this population.

The data presented in this study can be compared with a previous study of HLA-A, -B, and -C alleles in 252 African American individuals using probe-based testing (43). In that study, fewer alleles were detected at each locus. For example, at the HLA-A locus, Cao et al. identified 32 four-digit alleles, one of which was observed once in that study and not at all in this study. In the current study, 40 four-digit HLA-A alleles were identified; nine of these had not been observed previously and were observed once (six alleles) or twice (three alleles). Cao et al. did not detect HLA-B*5704 and Cw*0102, whereas these alleles were identified 8 and 12 times, respectively, in the current study. These differences are expected because of the increased sample size (564 vs 252) and the use of a higher resolution testing method (sequence-based typing vs sequence-specific oligonucleotide probes), which detects a greater number of known alleles. With the exception of B*5703 (allele frequency 0.04078 in this study vs 0.0040 in the study of Cao et al.) and Cw*0501 (0.0328 vs 0.0198), the same set of frequent (≥0.03) alleles was observed in both studies. A comparison of the 45 B:C haplotypes observed three or more times in the Cao et al. study with the 69 haplotypes observed in this study identified 36 haplotypes common to both studies, although the relative frequencies differed. Nine haplotypes observed by Cao et al. were not observed in this study. Most of these haplotypes were infrequent in the previous study.

An early study by Just et al. (7), using probe-based and restriction fragment testing, defined 31 four-digit DRB1 alleles in African American individuals from New York, while 45 were identified in this study. Just et al. also identified DRB1*0803, *0805, and *1103 alleles. Common alleles had similar frequencies with the exception of DRB1*1501 and *1503, which were present at frequencies of 0.160 and 0.006, respectively, in the Just et al. study, compared with 0.02482 and 0.12234 in this study.

The HLA-A, -B, -C, -DR, and -DQ assignments of African American families and unrelated individuals studied in an American Society for Histocompatibility and Immunogenetics (ASHI) minority antigens workshop were obtained at low resolution with higher resolution used to clarify haplotypes (8). About half of the 29 most common HLA-A, -B, and -DR haplotypes from the ASHI study were observed as common in this study. The previous study noted a significantly different geographic distribution of DR antigens in the population drawn from ten geographic regions of the United States. Such differences may be the basis for differences in the frequency of common haplotypes in the two studies.

Finally, Mori et al. (48) identified low-resolution antigen-level HLA-A, -B, and -DRB1 haplotypes from African American individuals. Nine of the ten most frequent haplotypes identified by Mori et al. were also frequent in this study (although not all nine ranked among the ten most frequent haplotypes identified in this study). The A2, B7, DR2 (A*0201G1, B*070201, DRB1*150101) haplotype, also common in the Irish population (49), was not observed among the haplotypes found three or more times in this study but was observed in two different individuals. Comparison with a European population of 1000 individuals from Northern Ireland tested by a probe-based method showed that four of the eight common three-locus haplotypes were also found in African American individuals, consistent with the admixture noted in other studies (3–5). Interestingly, one haplotype was identical at low resolution but differed in the alleles carried: A*020101, B*440201, DRB1*150101 in Irish individuals compared with A*0201G1, B*440301, DRB1*1503 in African American individuals.

A comparison can also be made with the HLA alleles and haplotypes present in African populations (42). As expected, many, but not all, of the common alleles, including many, but not all, of the ‘African’ alleles, were found in African American populations. Shared common haplotypes were also observed.

Six new alleles identified in this study were not detected during intermediate resolution probe-based typing and represented 0.13% of the total alleles at the four loci. All of the amino acid substitutions fall at positions that do not appear to affect peptide binding (50). Four of the six carry differences in conserved regions, which explains the failure to detect these differences during probe-based typing. It is likely that changes in the usually conserved positions will be observed in random individuals.

Knowledge of allele and haplotype frequencies in human populations guides the search for an HLA-matched unrelated hematopoietic stem cell donor. A new algorithm developed by the National Marrow Donor Program uses this information to predict the likelihood that a volunteer of a particular ethnicity will carry specific HLA alleles when typed at higher resolution (51). The results of this and similar studies will enhance the information used by the algorithm to improve the predictions, facilitating donor selection and conserving patient resources.

Acknowledgment

This research is supported by funding from the Office of Naval Research N00014-04-1-0398 (CKH and JN) and the National Institutes of Health grant GM35326 (GT and AL). The views expressed in this article are those of the authors and do not reflect the official policy or position of the Department of the Navy, the Department of Defense, or the US government.

References

  • 1
    Curtin PD. The Atlantic Slave Trade: A Census. Madison, WI: The University of Wisconsin Press, 1969.
  • 2
    US Census Bureau. The black population: 2000. http://www.census.gov/2001pubs/c2kbr01-5.pdf.
  • 3
    Parra EJ, Marcini A, Akey J et al. Estimating African American admixture proportions by use of population-specific alleles. Am J Hum Genet 1998: 63: 1839–51.
  • 4
    Reiner AP, Ziv E, Lind DL et al. Population structure, admixture, and aging-related phenotypes in African American adults: the Cardiovascular Health Study. Am J Hum Genet 2005: 76: 463–77.
  • 5
    Smith MW, Patterson N, Lautenberger JA et al. A high-density admixture map for disease gene discovery in African Americans. Am J Hum Genet 2004: 74: 1001–13.
  • 6
    Beatty PG, Mori M, Milford E. Impact of racial genetic polymorphism on the probability of finding an HLA-matched donor. Transplantation 1995: 60: 778–83.
  • 7
    Just JJ, King MC, Thomson G, Klitz W. African-American HLA class II allele and haplotype diversity. Tissue Antigens 1996: 48: 636–44.
  • 8
    Zachary AA, Bias WB, Johnson A, Rose SM, Leffell MS. Antigen, allele, and haplotype frequencies report of the ASHI minority antigens workshops: part 1, African-Americans. Hum Immunol 2001: 62: 1127–36.
  • 9
    Parham P, Ohta T. Population biology of antigen presentation by MHC class I molecules. Science 1996: 272: 67–74.
  • 10
    Robinson J, Waller MJ, Parham P et al. IMGT/HLA and IMGT/MHC: sequence databases for the study of the major histocompatibility complex. Nucleic Acids Res 2003: 31: 311–4.
  • 11
    Bugawan TL, Mack SJ, Stoneking M, Saha M, Beck HP, Erlich HA. HLA class I allele distributions in six Pacific/Asian populations: evidence of selection at the HLA-A locus. Tissue Antigens 1999: 53: 311–9.
  • 12
    Mack SJ, Bugawan TL, Moonsamy PV et al. Evolution of Pacific/Asian populations inferred from HLA class II allele frequency distributions. Tissue Antigens 2000: 55: 383–400.
  • 13
    Parham P, Arnett KL, Adams EJ et al. Episodic evolution and turnover of HLA-B in the indigenous human populations of the Americas. Tissue Antigens 1997: 50: 219–32.
  • 14
    Cadavid LF, Watkins DI. Heirs of the jaguar and the anaconda: HLA, conquest and disease in the indigenous populations of the Americas. Tissue Antigens 1997: 50: 209–18.
  • 15
    Yeager M, Hughes AL. Evolution of the mammalian MHC: natural selection, recombination, and convergent evolution. Immunol Rev 1999: 167: 45–58.
  • 16
    Hou L, Tu B, Ling G et al. Strategies for evaluating B*18 allelic diversity by sequence-based typing applied to studies of a population from Singapore and African-Americans. Tissue Antigens 2006: 67: 66–9.
  • 17
    Cao K, Chopek M, Fernandez-Vina MA. High and intermediate resolution DNA typing systems for Class I HLA-A, -B, -C genes by hybridization with sequence specific oligonucleotide probes (SSOP). Rev Immunogenet 1999: 1: 177–208.
  • 18
    Middleton D. HLA-A and -B typing by SSOP. Eur Fed Immunogenet Newslett 1998: 23: 16–8.
  • 19
    Voorter CE, Swelsen WT, van den Berg-Loonen EM. Intron sequences of HLA-B*73. Tissue Antigens 2001: 57: 463–8.
  • 20
    Hurley CK. Sequence-based typing for HLA-A: class I sequencing amplification of exons 2 and 3 based upon Cereb. In: Tilanus MG, Hansen JA, Hurley CK, eds. IHWG Technical Manual Genomic Analysis of The Human MHC: DNA-Based Typing for HLA Alleles and Linked Polymorphisms 2000: Publication of the 13th International Histocompatibility Working Group. Seattle, WA: Fred Hutchinson Cancer Research Center, 2000.
  • 21
    Cereb N, Maye P, Lee S, Kong Y, Yang SY. Locus-specific amplification of HLA class I genes from genomic DNA: locus-specific sequences in the first and third introns of HLA-A, -B, and -C alleles. Tissue Antigens 1995: 45: 1–11.
  • 22
    Ellis JM, Mack SJ, Leke RFG, Quakyi IA, Johnson AH, Hurley CK. Diversity is demonstrated in class I HLA-A and HLA-B alleles in Cameroon, Africa: description of HLA-A*03012, *2612, *3006 and HLA-B*1403, *4016, *4703. Tissue Antigens 2000: 56: 291–302.
  • 23
    Kotsch K, Wehling J, Blasczyk R. Sequencing of HLA class II genes based on the conserved diversity of the non-coding regions: sequencing based typing of HLA-DRB genes. Tissue Antigens 1999: 53: 486–97.
  • 24
    Lazaro AM, Steiner NK, Moraes ME et al. Ten novel HLA-DRB1 alleles and one novel DRB3 allele. Tissue Antigens 2005: 66: 327–9.
  • 25
    Marsh SG, Albert ED, Bodmer WF et al. Nomenclature for factors of the HLA system, 2004. Tissue Antigens 2005: 65: 301–69.
  • 26
    Lancaster A, Nelson MP, Single RM, Meyer D, Thomson G. PyPop: a software framework for population genomics: analyzing large-scale multi-locus genotype data. In: Altman RB, Dunker K, Hunter L, Jung T, Klein T, eds. Pacific Symposium on Biocomputing 8. Singapore: World Scientific, 2003: 514–25.
  • 27
    Lancaster AK, Single RM, Nelson MP, Solberg O, Thomson G. 14th international HLA and immunogenetics workshop biostatistics and anthropology/human genetic diversity joint report. Chapter 3. PyPop update-a software pipeline for large-scale multi-locus population genomics. Tissue Antigens 2007: in press.
  • 28
    Guo SW, Thompson EA. Performing the exact test of Hardy-Weinberg proportion for multiple alleles. Biometrics 1992: 48: 361–72.
  • 29
    Meyer D, Single R, Mack SJ et al. 13th IHWS anthropology/human genetic diversity joint report. Chapter 4. Single locus polymorphism of classical HLA genes. In: Hansen JA, Dupont B, eds. HLA 2004 Immunobiology of The Human MHC. Seattle: IHWG Press, in press.
  • 30
    Ewens W. The sampling theory of selectively neutral alleles. Theor Pop Biol 1972: 3: 87–112.
  • 31
    Watterson G. The homozygosity test of neutrality. Genetics 1978: 88: 405–17.
  • 32
    Slatkin M. An exact test for neutrality based on the Ewens sampling distribution. Genet Res 1994: 64: 71–4.
  • 33
    Slatkin M. A correction to the exact test based on the Ewens sampling distribution. Genet Res 1996: 68: 259–60.
  • 34
    Salamon H, Klitz W, Easteal S et al. Evolution of HLA class II molecules: allelic and amino acid site variability across populations. Genetics 1999: 152: 393–400.
  • 35
    Dempster A, Laird N, Rubin D. Maximum likelihood estimation from incomplete data using the EM algorithm. J R Stat Soc 1977: 39: 1–38.
  • 36
    Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol 1995: 12: 921–7.
  • 37
    Hedrick PW. Gametic disequilibrium measures: proceed with caution. Genetics 1987: 117: 331–4.
  • 38
    Lewontin RC. The interaction of selection and linkage. II. Optimum models. Genetics 1964: 50: 757–82.
  • 39
    Lewontin RC. On measures of gametic disequilibrium. Genetics 1988: 120: 849–52.
  • 40
    Cramer H. Mathematical Methods of Statistics. Princeton, NJ: University Press, 1946.
  • 41
    Excoffier L, Laval G, Schneider S. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Bioinformatics Online 2005: 1: 47–50.
  • 42
    Cao K, Moormann AM, Lyke KE et al. Differentiation between African populations is evidenced by the diversity of alleles and haplotypes of HLA class I loci. Tissue Antigens 2004: 63: 293–325.
  • 43
    Cao K, Hollenbach J, Shi X, Shi W, Chopek M, Fernandez-Vina MA. Analysis of the frequencies of HLA-A, B, and C alleles and haplotypes in the five major ethnic groups of the United States reveals high levels of diversity in these loci and contrasting distribution patterns in these populations. Hum Immunol 2001: 62: 1009–30.
  • 44
    Bodmer JG, Marsh SGE, Albert ED et al. Nomenclature for factors of the HLA system, 1996. Tissue Antigens 1997: 49: 297–321.
  • 45
    Bugawan TL, Klitz W, Blair A, Erlich HA. High-resolution HLA class I typing in the CEPH families: analysis of linkage disequilibrium among HLA loci. Tissue Antigens 2000: 56: 392–404.
  • 46
    Belich MP, Madrigal JA, Hildebrand WH et al. Unusual HLA-B alleles in two tribes of Brazilian Indians. Nature 1992: 357: 326–9.
  • 47
    Klitz W, Thomson G, Baur MP. Contrasting evolutionary histories among tightly linked HLA loci. Am J Hum Genet 1986: 39: 340–9.
  • 48
    Mori M, Beatty PG, Graves M, Boucher KM, Milford EL. HLA gene and haplotype frequencies in the North American population-the National Marrow Donor Program Donor Registry. Transplantation 1997: 64: 1017–27.
  • 49
    Williams F, Meenagh A, Single R et al. High resolution HLA-DRB1 identification of a Caucasian population. Hum Immunol 2004: 65: 66–77.
  • 50
    Bjorkman PJ, Parham P. Structure, function, and diversity of class I major histocompatibility complex molecules. Annu Rev Biochem 1990: 59: 253–88.
  • 51
    Hurley CK, Wagner JE, Setterholm MI, Confer DL. Advances in HLA: practical implications for selecting adult donors and cord blood units. Biol Blood Marrow Transplant 2006: 12: 28–33.