Genetic analysis of pharmacogenomic VIP variants in the Blang population from Yunnan Province of China

Abstract Background: Numerous pharmacogenetic studies have identified genetic polymorphisms as essential factors in the response to or metabolism of drugs. These variants, called very important pharmacogenetic (VIP) variants, play a role in drug metabolism and are summarized in the PharmGKB database. In this study, we genotyped 80 VIP variants from PharmGKB in 100 Blang volunteers from Yunnan Province. Methods: Based on the PharmGKB database, we genotyped 80 VIP variant loci located in 47 genes. We used χ2 tests to identify loci that differed significantly between the Blang and 11 other populations: ASW, CEU, CHB, CHD, GIH, JPT, LWK, MEX, MKK, TSI, and YRI. The global distribution of the significant variants was examined using the ALlele FREquency Database. We then used F-statistics (Fst), genetic structure, and phylogenetic tree analyses to ascertain the genetic affinity among the 12 populations. Results: Comparing the Blang with the 11 HapMap populations, we found that rs3814055 (NC_000003.12:g.119781188C>T) of nuclear receptor subfamily 1 group I member 2 (NR1I2, OMIM# 603065) was the most significant variant, followed by rs1540339 (NC_000012.12:g.47863543C>T) of the vitamin D receptor (VDR, OMIM# 601769). Furthermore, the frequency of rs3814055 in the Blang was closest to that of East Asian populations, especially the Miao. Genetic structure and F-statistics indicated that the Blang had a relatively close affinity with the CHD, CHB, and JPT populations. In addition, among Chinese ethnic groups, the Blang were closest to the Han of Shaanxi. Conclusions: Our results complement the pharmacogenomic information available for the Blang ethnic group and provide a theoretical basis for safer drug administration in the Blang.


| INTRODUCTION
Personalized medicine (Jain, 2009) means selecting the treatment best suited to a person based on a comprehensive consideration of that patient's characteristics. Its scope is wide, including pharmacogenetics, pharmacogenomics, and so forth. Pharmacogenomics, a crucial foundation for the development of personalized medicine and patient medication management, enables more precise therapy.
Furthermore, the Pharmacogenomics Knowledge Base (PharmGKB: http://www.pharmgkb.org) is an extremely useful resource for explaining gene-drug-disease relationships and, more importantly, for supporting personalized medicine projects. Recently, a large number of pharmacogenomics studies have focused on genetic variations considered to be involved in the response to or metabolism of drugs (Evans & McLeod, 2003). These genetic variations are also called very important pharmacogenetic (VIP) variants (Peters & McLeod, 2008). At present, a total of 246 VIP variants located in 66 genes are summarized in the PharmGKB database.
Numerous studies have shown that ethnicity strongly influences the frequencies of gene variants. China has 56 officially recognized ethnic groups, including the Blang. The Blang have a population of 91,882 (fifth national census, 2000), most of whom live in the Mount Blang, Xiding, Bada, Mengman, and Daluo areas of Menghai County in Xishuangbanna Dai Autonomous Prefecture, Yunnan Province, Southwest China. The others are distributed in the Lincang, Simao, and Baoshan areas (Wang, Hu et al., 2008a). The areas they live in have a mild climate and abundant natural resources. The Blang are mainly engaged in agricultural production, especially tea planting, and their region is the origin of the famous Pu'er tea.
This study aims to determine the genotype and allele frequency distributions of pharmacogenetic variants in the Blang. We compare the Blang with the 11 HapMap populations and two national minorities to assess differences in allele frequencies. The results will complement the pharmacogenomics database, improve understanding of the Blang, and support more appropriate individualized health management for them in the future.

| Ethical compliance
All participants were informed, both in writing and verbally, of the procedures and purpose of the study and signed informed consent documents. The study protocol was approved by the Clinical Research Ethics Committee of Xizang Minzu University and is in accordance with the Department of Health and Human Services (DHHS) regulations for human research subject protection.

| Study participants
We randomly recruited 100 unrelated, healthy Blang people from Yunnan Province, China. Each participant underwent rigorous screening. None of the subjects had a self-reported history of cancer or any other disease. Moreover, despite the influence of the Han and Dai peoples, whose economic and cultural development has been relatively rapid, the Blang still maintain their own ethnic characteristics. The participants can therefore be regarded as representative of the Blang population.

| Variant selection and genotyping
We chose 80 VIP variant loci located in 47 genes from the PharmGKB database. Genomic DNA was extracted from peripheral blood samples using the GoldMag-Mini Whole Blood Genomic DNA Purification Kit (GoldMag Ltd., Xi'an, China) according to the manufacturer's protocol. DNA concentration was measured with a NanoDrop 2000C spectrophotometer (Thermo Scientific, Waltham, MA). We used the Sequenom MassARRAY Assay Design 3.0 Software (San Diego, CA) to design multiplexed SNP MassEXTEND assays (Gabriel, Ziaugra, & Tabbaa, 2009) and genotyped the variants using the Sequenom MassARRAY RS1000 (San Diego, CA). Data management and analyses were completed with the Sequenom Typer 4.0 software (San Diego, CA), as in previous research (Jin, Aikemu et al., 2015a; Jin, Yang et al., 2015b; Thomas et al., 2007).

| Statistical analyses
We performed χ2 tests and Hardy-Weinberg equilibrium (HWE) analysis using Microsoft Excel (Redmond, WA) and the SPSS 19.0 statistical software platform (SPSS, Chicago, IL). The genotype frequencies of the 80 variants in the Blang population were compared separately with those of 11 other populations: the Chinese Han in Beijing, China (CHB); the Chinese of metropolitan Denver, Colorado, USA (CHD); the Japanese in Tokyo, Japan (JPT); Utah residents with Northern and Western European ancestry (CEU); the Gujarati Indians in Houston, Texas, USA (GIH); people with Mexican ancestry living in Los Angeles, California, USA (MEX); the Tuscan people of Italy (TSI); a population of African ancestry in the southwestern USA (ASW); the Luhya people in Webuye, Kenya (LWK); the Maasai people in Kinyawa, Kenya (MKK); and the Yoruba in Ibadan, Nigeria (YRI). All p values were two-sided, and p < 0.05 was considered significant before correction. Bonferroni correction for multiple testing was then applied, and loci with p < [0.05/(80 × 11)] were regarded as significantly different. Subsequently, we downloaded allele frequencies for the significant SNPs from the ALlele FREquency Database (http://alfred.med.yale.edu, ALFRED) and analyzed global genetic variation patterns using the HapMap database (Gibbs et al., 2003).
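The tests described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Excel/SPSS workflow used in the study: the function names and the genotype counts are hypothetical, the genotype comparison assumes a 2 × 3 table (two populations, three genotype classes, so 2 degrees of freedom), and the HWE check assumes a biallelic locus tested with 1 degree of freedom.

```python
import math


def bonferroni_threshold(alpha=0.05, n_variants=80, n_populations=11):
    """Per-test significance level, 0.05 / (80 x 11) with the defaults."""
    return alpha / (n_variants * n_populations)


def genotype_chi2(counts_a, counts_b):
    """Pearson chi-square test on a 2 x 3 genotype table.

    counts_a and counts_b are genotype counts (e.g. CC, CT, TT) in two
    populations. Returns (chi2, p). For df = 2 the upper-tail probability
    of the chi-square distribution is exactly exp(-chi2 / 2).
    """
    rows = [list(counts_a), list(counts_b)]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(*rows)]
    if len(col_totals) != 3 or min(col_totals) == 0 or min(row_totals) == 0:
        raise ValueError("need two non-empty populations and three observed genotypes")
    n = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2)


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p) with df = 1, where the upper-tail probability is
    erfc(sqrt(chi2 / 2)).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("locus is monomorphic; HWE test is undefined")
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical genotype counts, for illustration only.
chi2, p_value = genotype_chi2([10, 20, 70], [70, 20, 10])
print(chi2, p_value, p_value < bonferroni_threshold())
```

With the default arguments the corrected threshold is 0.05/880, roughly 5.7 × 10⁻⁵, so a locus is counted as significantly different only when its p value falls below that level.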

| Population genetic structures analysis
To examine the genetic structure of the populations, we used Structure 2.3.4 (Pritchard Lab, Stanford University, USA) (http://pritchardlab.stanford.edu/software/structure_v.2.3.4.html) to analyze the variation of the selected VIP variants. Using the Bayesian clustering approach of Pritchard, Stephens, and Donnelly (2000), we performed structure analysis to assign the samples to a hypothesized number K of populations. The MCMC analysis for each value of K (K = 3-10) was run for 10,000 steps after an initial burn-in period of 10,000 steps. We used ΔK, calculated with STRUCTURE HARVESTER, to identify the most likely number of clusters (Evanno, Regnaut, & Goudet, 2005). Moreover, Wright's F-statistics are the most widely used descriptive statistics in population and evolutionary genetics (Wright, 1931). We used Arlequin version 3.1 to calculate Fst values and infer the pairwise distances between populations. In addition, the neighbor-joining method was used to group the populations into clusters based on genetic distance.
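The ΔK criterion mentioned above can be illustrated with a short Python sketch. It follows one common reading of the Evanno method, in which ΔK is the absolute second-order difference of the mean log-likelihood, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of L(K) across replicate runs. The function name `delta_k` and the log-likelihood values are hypothetical; the study does not report the number of replicate runs, and the sketch needs at least two per K.

```python
from statistics import mean, stdev


def delta_k(ln_likelihoods):
    """Compute Evanno's delta K for every K that has both neighbours.

    ln_likelihoods maps K -> list of Ln P(D) values from replicate runs
    (at least two replicates per K). Returns a dict {K: delta K}.
    """
    means = {k: mean(values) for k, values in ln_likelihoods.items()}
    result = {}
    for k in sorted(ln_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue  # the first and last K have no second-order difference
        sd = stdev(ln_likelihoods[k])
        if sd == 0:
            continue  # delta K is undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical replicate log-likelihoods, for illustration only.
runs = {
    2: [-1000.0, -1002.0],
    3: [-900.0, -902.0],
    4: [-890.0, -892.0],
    5: [-885.0, -887.0],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The K with the largest ΔK is taken as the most likely number of clusters; because ΔK needs both neighbours, it cannot be evaluated at the smallest or largest K tested.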

| Basic information of the VIP variants
We genotyped 80 VIP variants selected from the PharmGKB database in 100 members of the Blang population.
The PCR primers for the selected single-nucleotide polymorphisms (SNPs), listed in Table S1, were designed with the Sequenom MassARRAY Assay Design 3.0 Software. Basic information on the selected variants is shown in Table 1, including the gene names, positions, nucleotide changes, amino acid translations, and the allele and genotype frequencies in the Blang.

| Analyses of 80 loci among 12 populations
The average variant call rate was over 95%. All selected loci met HWE. Using the chi-square test, we compared the genotype frequency distributions of the 80 loci between the Blang and the 11 populations. Before adjustment (p < 0.05), the numbers of variants that differed significantly between the Blang and ASW, CEU, CHB, CHD, GIH, JPT, LWK, MEX, MKK, TSI, and YRI were 23, 30, 17, 30, 30, 21, 26, 21, 25, 22, and 35, respectively (data not shown). After adjustment (p < [0.05/(80 × 11)], listed in Table 2), there were 15, 20, 6, 25, 25, 7, 19, 7, 20, 15, and 26 significantly different loci between the Blang and the 11 populations, respectively. Although the two sets of results differed, they also showed similarities. Notably, CHB had the fewest loci that differed from the Blang.
According to Table 2, significant variants in some genes, such as VDR and NR1I2, were found in every population. The variants rs10735810, rs11568820, rs1540339, rs1544410, rs2228570, rs2239179, rs2239185, rs731236, and rs7975232 are located in VDR (vitamin D receptor), which encodes the nuclear hormone receptor for vitamin D3. Although it does not change an amino acid, rs1540339 was highly significant among the nine populations except CHB, JPT, and MEX. By contrast, rs2228570 (HGVS: NM_000376.2:c.2T>G), located in exon 2 of VDR and the only one of these SNPs that changes an amino acid, was prominent in CHD.
Although rs3814055 in NR1I2 varied little, significant differences still existed. We downloaded the associated data for rs3814055 from ALFRED (http://alfred.med.yale.edu). As seen in Figures 1 and 2, the frequency in the Blang was closest to those of populations distributed in East Asia, especially the Miao. On the whole, the frequency of allele C of rs3814055, which ranged from 67% to 94%, was higher in East Asia than in other populations. The Blang had the highest frequency among them, so attention should be paid to allele C in this population.

| The relationship between 23 populations
We used Structure 2.3.1 software to analyze the genetic structure of the 23 populations in order to further identify the relationships between them worldwide. K values ranging from 2 to 10 were assumed in the structure analysis. The results for K = 2 and 3 among global populations and for K = 3 and 4 among ethnic groups from China are shown in Figure 3. The cluster analysis indicated that when K = 3, the populations were divided into three subgroups (subgroup 1: Blang, CHB, CHD, JPT, SX Han; subgroup 2: CEU, GIH, MEX, TSI, Deng, Sherpa, Lhoba, Kyrgyz, Tajik, Uygur; subgroup 3: ASW, LWK, MKK, YRI, Miao, Li, Tibet, Mongol), with individuals assigned to subgroups by relative majority of likelihood. The results illustrated that the Blang had a relatively close affinity with CHB, CHD, and JPT, which is consistent with Table 2. Likewise, when comparing ethnic groups within China, we found that the Blang were closest to the SX Han. Based on genetic structure, we further assessed the genetic relationships among the 12 populations using pairwise Fst values (Table 3). The differences between the Blang and CHB, CHD, and JPT (Fst = 0.04728, 0.04259, and 0.04914, respectively) were the smallest. The smaller the Fst value, the more similar two populations are. The results indicated that the Blang and these three groups had a relatively close genetic relationship.

Notes. SNP: single-nucleotide polymorphism; HWE: Hardy-Weinberg equilibrium. The GenBank references of the above genes were as follows: ABCB1 (NC_000007.14), ADH1A (NC_000004.12), ADH1B (NC_000004.12), ADRB1 (NC_000010.11), ADRB2 (NC_000005.10), AHR (NC_000007.14), ALDH1A1

| DISCUSSION
There is increasing interest in personalized medicine, because genetic variation causes individuals to metabolize and react to some drugs differently. In this study, we genotyped the pharmacogenomic VIP variants in the Blang population. We found that NR1I2 rs3814055 was the most significant variant among the 12 selected populations, followed by VDR rs1540339. Using genetic structure analysis and Fst values, we also concluded that the genetic background of the Blang was similar to that of CHB.
The pregnane X receptor (PXR), encoded by the gene NR1I2, belongs to the nuclear hormone receptor superfamily; as a transcription factor, its major role is to promote the detoxification and clearance of drugs and toxic xenobiotics from the body (Bertilsson et al., 1998). Some CYPs (Ding et al., 2015; Jin, Zhang, Shi et al., 2016a; Jin, Zhang, Geng et al., 2016b; Shan et al., 2016; Zhang et al., 2016) regulated by PXR/NR1I2 are associated with phase I metabolism in humans. Moreover, some studies (Lown et al., 1997; Shimada, Yamazaki, Mimura, Inui, & Guengerich, 1994) suggested that SNPs in PXR may be a main reason for differences in drug reactions and in the induction of CYP3A4. Rs3814055, located in the 5' untranslated region (UTR) of NR1I2, has already attracted the attention of many researchers for both its disease risk and its pharmacogenomic impact.
Numerous studies have shown that the frequency of rs3814055 in the NR1I2 gene varies among populations. The reported frequency of this variant was 0.218 in a Chinese Han population (Wang et al., 2007), 0.39 in Caucasians (Zhang et al., 2013), 0.21 in Asians (King et al., 2007), 0.50 in Europeans (King et al., 2007), 0.36 in the Dutch (Bosch et al., 2006), and 0.34 in African Americans (Thomas et al., 2007). In our previous studies, the frequencies of the rs3814055 variant in the Lhoba and Miao populations were 0.101 and 0.09, respectively (Jin, Aikemu et al., 2015a).
In our study of the Blang, the frequency of allele T of rs3814055 was 0.06 (Figures 1 and 2). This frequency was similar to those of the Lhoba and Miao, yet still lower than those of the other populations. In a Chinese Han population, upregulated CYP3A4 expression was attributed to rs3814055 (−25385T) (Zhang et al., 2001). Additionally, another report showed that allele C was linked to inflammatory bowel disease (IBD) in a European population (Martínez et al., 2010). However, the haplotype TCC of rs3814055/rs6784598/rs2276707 functioned as a whole in risk assessment for ulcerative colitis (UC) in a Spanish population. In addition, Kurzawski et al. revealed significant differences in tacrolimus concentrations between patients with different NR1I2 rs3814055 C>T genotypes (Kurzawski, Malinowski, Dziewanowski, & Drozdzik, 2017). Zazuli et al. (2015) found that, among Indonesian patients with tuberculosis, carriers of the TT genotype of rs3814055 had a significantly greater risk of antituberculosis drug-induced liver injury than carriers of the CC genotype.
The SNP rs1540339 is located in an intron of VDR. Previous studies have demonstrated that rs1540339 is related to susceptibility to type 1 diabetes mellitus (T1DM), colorectal cancer (Wang, Li, & Zhou, 2008b), and so on. Another study similarly concluded that the variant is involved in T1DM prevention (Wang, Li et al., 2008b). Jin et al. reported that the frequency of allele T of rs1540339 in the Li population was higher than that of allele C, indicating that the Li had lower susceptibility to T1DM. In our study, the frequencies of alleles C and T of rs1540339 in the Blang were 34% and 66%, respectively. We therefore speculate that the Blang may have lower susceptibility to T1DM.
Considering the above results, ethnicity is an important factor in the frequency distribution of these variants, and the genotype of rs3814055 may serve as a marker for IBD and UC. The Blang may also have a lower susceptibility to T1DM. Although rs1540339 has not yet been shown to be clinically relevant in the Blang, it deserves attention in future studies. At present, a growing number of teams, including that of Jin et al., are devoted to research on SNPs and disease (Du et al., 2016; Duan et al., 2015; Hu et al., 2014; Wang et al., 2015; Yang et al., 2016), and we hope that our data will complement the pharmacogenomics database and contribute to the development of personalized medicine.