Determination of HLA-A, -C, -B, -DRB1 allele and haplotype frequency in Japanese population based on family study.

Abstract The present study investigates the human leucocyte antigen (HLA) allele and haplotype frequencies in the Japanese population. We carried out the frequency analysis in 5824 families living across the Japanese archipelago. The studied population had mainly been typed for the purpose of transplantation, especially hematopoietic stem cell transplantation (HSCT). We determined HLA class I (A, B, and C) and HLA class II (DRB1) alleles using Luminex technology. The haplotypes were directly counted by segregation. A total of 44 HLA‐A, 29 HLA‐C, 75 HLA‐B, and 42 HLA‐DRB1 alleles were identified. In the HLA haplotypes of A‐C‐B‐DRB1 and C‐B, the pattern of linkage disequilibrium peculiar to the Japanese population was confirmed. Moreover, the haplotype frequencies based on the family study were compared with the frequencies estimated by maximum likelihood estimation (MLE), and equivalent results were obtained. The allele and haplotype frequencies obtained in this study could be useful for anthropology, transplantation therapy, and disease association studies.


Introduction
The human leukocyte antigen (HLA) gene family is characterized by an extreme degree of genetic polymorphism and linkage disequilibrium (LD). The polymorphism and LD patterns of the HLA gene family tend to be unique to each ethnic group (1,2). HLA antigens are known to play an important role in immune responses. In hematopoietic stem cell transplantation (HSCT), HLA matching between donors and recipients lowers the risk of graft rejection and graft-versus-host disease (GVHD) (3,4). Morishima et al. suggested that the genetic difference derived from the HLA haplotype is associated with acute GVHD in allogeneic HSCT (5). Therefore, the HLA haplotype cannot be excluded from consideration during donor selection, because of the potential contribution of proteins encoded by non-HLA genes inherited with HLA genes.
Several studies have reported HLA allele and haplotype frequency data in the Japanese population (6)(7)(8)(9). However, these studies could not present accurate and detailed information because the haplotypes were estimated using software, the sample size was small, or both. This necessitates a method that can produce an accurate and detailed gene distribution. The present study aims to obtain a more exact and detailed HLA haplotype distribution from 18,604 members of 5824 Japanese families, whose HLA haplotypes were determined by descent. Our study also attempts to determine the frequency of specific haplotypes, C-B, A-B-DRB1, and A-C-B-DRB1, used in donor search. In addition, we ascertained whether the haplotype frequencies estimated by maximum likelihood estimation (MLE) would be equivalent to the frequencies found in the present family study.
The copyright line for this article was changed on 23 September 2016 after original online publication.

Subjects
A total of 18,604 members (including patients and normal subjects) from 5824 families, distributed in all parts of Japan, were enrolled in this study. These families included patients being considered for transplantation, especially HSCT. The families were divided into three groups (Table 1): (i) families with both parents and one or more children, (ii) families with one parent and one or more children, and (iii) families with no parents but two or more children. The families with more than two generations were counted as separate families. Informed consent was obtained from all the participants of this study by the clinicians who ordered HLA typing.
To compare the haplotype frequencies obtained by the family study with those obtained using MLE software, 4500 unrelated people were chosen at random from the total subjects of the present study (18,604 members). They were genetically unrelated, because one person was chosen from each of 4500 randomly chosen families. Overlaps of blood relationship across three or more generations were avoided in these families. The allele and haplotype frequencies calculated from these 4500 people were very similar to the frequencies from the total subjects [allele frequencies (AF) data not shown].
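The selection of unrelated subjects described above (one person from each randomly chosen family) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `sample_unrelated`, the family identifiers, and the fixed seed are assumptions, and the exclusion of families overlapping across three or more generations is not modeled.

```python
import random

def sample_unrelated(families, n_families, seed=0):
    """Pick one member from each of n_families randomly chosen families.

    families: dict mapping a family ID to a list of member IDs (hypothetical layout).
    Returns a dict mapping each chosen family ID to its selected member.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    chosen = rng.sample(sorted(families), n_families)  # families without replacement
    return {fam: rng.choice(families[fam]) for fam in chosen}

# Hypothetical toy data; the study drew 4500 families from 5824.
families = {
    "F1": ["father", "mother", "child1"],
    "F2": ["mother", "child1", "child2"],
    "F3": ["child1", "child2"],
}
picked = sample_unrelated(families, 2)
```

Because at most one person is drawn per family, no two selected subjects share a pedigree, which is the property the comparison with MLE relies on.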

Samples
DNA samples were obtained from peripheral lymphocytes or buccal cells using the JetQuick® Blood & Cell Culture Kit (GENOMED, Löhne, Germany) or the QuickGene DNA Tissue Kit (KURABO, Osaka, Japan) according to the manufacturers' protocols.

HLA allele typing
HLA (-A, -C, -B, and -DRB1) four-digit allele typing was performed using the Luminex 200 system (Luminex, Austin, TX) and the WAKFlow HLA Typing kit (Wakunaga, Hiroshima, Japan) (10)(11)(12). HLA alleles were assigned automatically using the WAKFlow Typing software (Wakunaga, Hiroshima, Japan). The primer sequences of the WAKFlow Typing kit are specifically designed to make allele determination easier in the Japanese population, and by default the analysis with the WAKFlow Typing software is based on the AF of the donors registered with the Japan Marrow Donor Program (JMDP), which are available on the website www.bmdc.jrc.or.jp (9). Therefore, this method can determine alleles with frequencies of 0.1% or greater in the Japanese population. A few rare alleles that could not be determined by this method were typed secondarily using the Luminex 200 system and the LAB Type SSO kit (One Lambda, Los Angeles, CA) and assigned using the HLA Fusion software (One Lambda). In brief, exon 2 for HLA-DRB1 and exons 2 and 3 for HLA-A, -B, and -C were amplified in these methods.

Haplotype determination
The haplotypes were determined by segregation. This study was designed to assess genetic linkage with high certainty. Nevertheless, the results also included some partial haplotypes because of the possibility of one or more recombination events in a family. The haplotype frequencies were calculated from haplotypes without taking recombination into account. Thus, we counted the haplotypes of parents and not of children, because fathers and mothers are genetically unrelated, whereas haplotypes created by recombination occur only in children. Some haplotypes of parents whose children had recombinant haplotypes could not be determined, because two phase combinations were possible. In these cases, the more probable phase was chosen and counted. Specifically, let the four candidate haplotypes be Hp1, Hp2, Hp3, and Hp4, with frequencies HF1, HF2, HF3, and HF4, and let the two possible phases be 'Hp1, Hp2' and 'Hp3, Hp4'; HF1 was multiplied by HF2 and HF3 by HF4, and the haplotypes of the phase with the larger product were counted.
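The phase-selection rule for ambiguous parents can be illustrated with a short sketch. The function name `choose_phase` and the frequency values are hypothetical, and the handling of ties (the first phase is kept) is an assumption, since the text does not say how equal products were resolved.

```python
def choose_phase(phase_a, phase_b, hap_freq):
    """Return the candidate phase whose two haplotypes have the larger frequency product.

    phase_a, phase_b: pairs of haplotype labels, e.g. ("Hp1", "Hp2").
    hap_freq: dict mapping a haplotype label to its frequency; unseen labels count as 0.
    """
    def product(phase):
        h1, h2 = phase
        return hap_freq.get(h1, 0.0) * hap_freq.get(h2, 0.0)

    # Assumption: on a tie, phase_a is kept.
    return phase_a if product(phase_a) >= product(phase_b) else phase_b

# Hypothetical frequencies HF1..HF4 for Hp1..Hp4.
hap_freq = {"Hp1": 0.08, "Hp2": 0.01, "Hp3": 0.03, "Hp4": 0.02}
chosen = choose_phase(("Hp1", "Hp2"), ("Hp3", "Hp4"), hap_freq)
# HF1 * HF2 = 0.0008 exceeds HF3 * HF4 = 0.0006, so 'Hp1, Hp2' is counted.
```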
For comparison with the MLE result, the haplo.em program of the HAPLO.STATS (version 1.6.0) software, run in the R language, was used (13)(14)(15)(16). For genetic markers measured on unrelated subjects with unknown linkage phase, this program computes the maximum likelihood estimates of haplotype probabilities using a progressive insertion algorithm, which inserts batches of loci step by step into haplotypes of growing length.
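To make the idea of maximum likelihood haplotype estimation concrete, the sketch below runs a plain expectation-maximization (EM) loop over unphased genotypes. It is not the progressive insertion algorithm of haplo.em and not the study's analysis: it enumerates every compatible haplotype pair up front, assumes Hardy-Weinberg proportions, and uses hypothetical function names and toy genotypes.

```python
from collections import Counter
from itertools import product

def compatible_pairs(genotype):
    """Enumerate haplotype pairs consistent with an unphased genotype.

    genotype: tuple of per-locus allele pairs, e.g. (("A1", "A2"), ("B1", "B2")).
    """
    pairs = set()
    # At each locus, choose which allele is placed on the first haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(locus[c] for locus, c in zip(genotype, choice))
        h2 = tuple(locus[1 - c] for locus, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))  # order within a pair is irrelevant
    return sorted(pairs)

def em_haplotype_frequencies(genotypes, n_iter=100):
    """Estimate haplotype frequencies from unphased genotypes by plain EM."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = Counter()
        for pairs in expansions:
            # E-step: probability of each pair under Hardy-Weinberg proportions.
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
    return freq

# Toy data: three A1-B1 homozygotes and one double heterozygote of unknown phase.
genotypes = [(("A1", "A1"), ("B1", "B1"))] * 3 + [(("A1", "A2"), ("B1", "B2"))]
freq = em_haplotype_frequencies(genotypes)
```

In this toy case the common A1-B1 haplotype pulls the ambiguous individual toward the A1-B1 / A2-B2 phase, so the alternative haplotypes A1-B2 and A2-B1 receive essentially zero frequency. The same mechanism is consistent with the later observation that rare haplotypes seen in families can go undetected by MLE.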

Statistical analysis
The haplotypes were counted manually using Microsoft Excel® spreadsheets. The haplotypes that extended over three or more generations were counted once (Figure 1). The allele and haplotype frequencies were calculated using 19,183 haplotypes, counted as described above. Relative LD values (RD) were computed for each haplotype (17,18). The exact test for deviation from Hardy-Weinberg equilibrium was performed with GENEPOP software, version 4.2 (19,20), which uses a Markov chain (MC) algorithm (dememorization = 10,000, batches = 10,000, and iterations per batch = 10,000) to estimate the P-value. The expected prevalence (P) of an allele or haplotype under Hardy-Weinberg proportions was calculated from its frequency (AF) using the following equation: $P = 1 - (1 - \mathrm{AF})^2$.
Table 2 presents the list of the AF of the HLA-A, -B, -C, and -DRB1 loci (21). We identified 44 HLA-A, 75 HLA-B, 29 HLA-C, and 42 HLA-DRB1 alleles. A*24:02 had the highest frequency in the Japanese population, 36.48%; thus, it is expected to be carried by approximately 60% of the population. The alleles underlined in Table 2 need specific attention for HLA allele matching in unrelated HSCT between Japanese individuals, as they are present at high frequencies within a serotype (allele family). Table 3 lists 60 haplotypes with frequencies higher than 0.2% in the population. Approximately 38% of the entire Japanese population is expected to carry one or two of the five most common haplotypes.
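The expected-prevalence formula, P = 1 − (1 − AF)², can be checked numerically against the A*24:02 figure quoted above. The function name `expected_prevalence` is illustrative only.

```python
def expected_prevalence(freq):
    """Expected fraction of individuals carrying at least one copy under Hardy-Weinberg proportions."""
    # (1 - freq) ** 2 is the probability that neither chromosome carries the allele.
    return 1.0 - (1.0 - freq) ** 2

# A*24:02 allele frequency reported in the text: 36.48%.
p_a2402 = expected_prevalence(0.3648)
print(f"{p_a2402:.4f}")  # 0.5965, i.e. approximately 60% of the population
```

The same function applies to a haplotype frequency, or to the summed frequency of several haplotypes when asking how many people carry at least one of them.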

Haplotype frequencies of HLA-A-C-B-DRB1
These data have also been submitted to the Allele Frequency Net Database (AFND) (22). The four-loci haplotypes with frequencies of 0.01% or more and the AF can be found at the AFND website, www.allelefrequencies.net (22). Table 4 lists the sets of three-loci haplotypes with frequencies of 0.2% or greater that share the same serotype. Table 5 lists 64 haplotypes with frequencies higher than 0.1% in the population. Half of these haplotypes have RD values

Observed recombination
A total of 136 recombination events were observed in 134 families. These were divided into two groups: (i) cases in which the haplotypes of the parents could be determined (103 parents, 75.7%) and (ii) cases in which the haplotypes of the parents could not be determined and were therefore inferred (33 parents, 24.3%). Table 6 summarizes the number of recombination events observed in informative families, that is, families containing the parents and three or more children. Table 6 also indicates that the recombination probabilities of HLA-A-C and B-DRB1 are 0.54%.
Table 7 shows the haplotype frequencies of HLA-A-C-B-DRB1 based on the family study (result-FS) and on MLE (result-MLE) for haplotypes with frequencies above 0.5%. For frequent haplotypes, with frequencies of 0.12% or more, result-MLE tends to be higher than result-FS. For low-frequency haplotypes, with frequencies below 0.12%, result-MLE tends to be lower than result-FS. In addition, 585 of the 2099 haplotypes with frequencies below 0.12% in result-FS were not detected in result-MLE. In the allele data used for this comparison, all four loci were in Hardy-Weinberg equilibrium: the P-values of the exact test at the HLA-A, -B, -C, and -DRB1 loci were 0.3184, 0.2557, 0.1449, and 0.4998, respectively.
Haplotype analysis helps not only in understanding the history of human migration but also in matching for unrelated donor searches for HSCT. In JMDP, HLA compatibility with a donor is evaluated by both serotype and genotype. Allele matching is a better measure of compatibility than serotype matching; the risks of rejection and GVHD in bone marrow transplantation (BMT) have been found to be lower with allele-level matching than with serotype-level matching (3,32). Accordingly, attention should be paid to the alleles within each serotype: serotypes such as A2 and A26 increase the possibility of an allele mismatch in spite of a serotype match (Table 2). As shown in Table 4, HLA-A2, B61, B15, and DR4 especially increase the allele mismatch risk. However, a South Korean haplotype analysis identified 15 sets of haplotypes requiring attention in HSCT matching (27), whereas Japan has eight sets (Table 4). Table 4 also shows that, for most HLA haplotypes at the serotype level, HLA haplotype matching is almost equivalent to allele-level matching in HSCT between Japanese individuals.
Although the HLA-C locus is not required for registration in HSCT, HLA-C allele matching is important (32,33). Even without information on the donor HLA-C allele, the C-B haplotypes can predict the HLA-C allele, although not always, because the C-B linkage is well conserved (Table 5); this conservation is possible because of the short genetic distance between the two loci. In other words, HLA-B allele matching increases the possibility of HLA-C allele matching. Accordingly, analysis of the HLA distribution in the Japanese population may contribute to planning strategies of HLA matching for HSCT.
Result-MLE was similar to result-FS (Table 7), indicating that the haplotype frequencies estimated by the software are very close to those of the family-derived haplotypes. If detection of low-frequency haplotypes is needed, however, determination of haplotypes by descent or a large sample size appears to be necessary.
In conclusion, determining HLA allele and haplotype frequencies from family samples serves not only as a tool for elucidating the linkage between HLA loci but also as a tool for detecting changes in HLA genes in human germ cells, such as recombination. The data obtained in this study will be useful in various fields such as anthropology, transplantation therapy, and disease association studies.