Effects of RHD gene polymorphisms on distinguishing weak D or DEL from RhD− in blood donation in a Chinese population

Abstract
Background: Weak D or DEL red blood cell units may be mistyped as RhD− by current serology assays, which can lead to incompatible transfusion to RhD− recipients and subsequent anti‐D immunization. Molecular RHD blood group typing is an effective method for overcoming these technical limits. The purpose of this study was to identify RHD single‐nucleotide polymorphisms (SNPs), to compare the genotype prevalence among confirmed RhD− individuals in a Chinese population, and to explore effective biomarkers for weak D or DEL detection before blood transfusion.
Methods: In the present study, 125 weak D (1, 2, 3, and 4.1) or DEL and 185 RhD− blood samples from donors typed by current standard serology were collected. A genotyping system was used to analyze the SNPs of RHD in each sample.
Results: Seven SNPs (rs592372, rs11485789, rs6669352, rs3118454, rs1053359, rs590787, and rs3927482) were detected in the RHD region. Rs3118454, rs1053359, rs590787, and rs3927482 showed significant differences between the weak D (1, 2, 3, and 4.1) or DEL and RhD− groups. Further combined analysis of the allelic distribution of these four SNPs revealed their higher frequencies in the RhD− group.
Conclusion: The SNPs rs3118454, rs1053359, rs590787, and rs3927482 in RHD showed a significantly higher frequency among an RhD− Chinese population and are potential biomarkers.

Genetic factors have recently been recognized as important risk factors in the pathogenesis of many complex diseases such as hypertension, obesity, coronary artery disease, and diabetes (Joy, Lahiry, Pollex, & Hegele, 2008; Shore, 2008; Topol, Smith, Plow, & Wang, 2006; Wang, 2005). However, the genetic mechanisms of many diseases remain largely unknown. Single-nucleotide polymorphisms (SNPs), as common DNA sequence variations, occur at single bases within genomic DNA and increase human genetic variation. SNPs are reported to be associated with many human diseases and in some cases are direct causative factors (Brookes, 1999; Hemminki & Bermejo, 2005; Yamada, 2008). Researchers have found that SNPs in candidate genes influenced individuals' susceptibility to oncogenesis, including renal cell cancer (Cao et al., 2012; Purdue et al., 2011), acute lymphoblastic leukemia (Huang et al., 2012), esophageal cancer (Hildebrandt et al., 2009), bladder cancer (Chen et al., 2009), colon cancer, rectal cancer (Slattery et al., 2010), and lung cancer (Pu et al., 2011). These polymorphisms are located in specific coding or noncoding regions of genes and influence gene expression. However, whether SNPs exist in RHD (OMIM accession number: 111680) DNA sequences and whether they influence gene expression remains unclear.
In this study, we analyzed blood samples to identify potential SNPs in RHD using the HapMap system. We also conducted a SNaPshot assay to identify novel RHD SNPs and determined their association with RhD phenotypes in a Chinese population.

| Ethical compliance
The use of human blood samples in this study was approved by the independent ethics committee of Nanjing Red Cross Blood Center on 5 January 2011. Blood samples used for SNP analysis were remaining samples after clinical use from blood donors who voluntarily donated whole blood at our blood center. All volunteers signed an informed consent statement to approve the use of their remaining sample.

| Blood samples and immunohematology
Ethylenediaminetetraacetate-anticoagulated blood samples were randomly collected from 300 blood donors at the Nanjing Red Cross Blood Center during a 3-year period starting in January 2011. All donors were of Chinese Han ethnicity. Their Rh phenotypes were determined using standard serological kits according to the manufacturer's instructions (Gamma Biologicals, Houston, TX).

| SNP selection
To assess the power for detecting associations due to linkage disequilibrium (LD) with causal loci, we carried out power calculations for an indirect association study that uses Tag-SNPs. SNPs in RHD (GenBank reference sequence version number: NG_007494.1) were selected based on HapMap data (http://hapmap.ncbi.nlm.nih.gov/) and dbSNP data (http://www.ncbi.nlm.nih.gov/projects/SNP/). Potentially functional SNPs were identified according to the following criteria: (a) located in the 5′ flanking regions, 5′ untranslated region (UTR), 3′UTR, or coding regions with amino acid changes; or (b) minor allele frequency of 5% in the Chinese population. According to these criteria, seven SNPs were identified in RHD: rs11485789, rs1053359, rs3118454, rs590787, rs6669352, rs592372, and rs3927482.
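The selection criteria above can be sketched as a simple filter. This is only an illustration: the record fields (`region`, `maf`), the region labels, and the example records are hypothetical, the criteria are combined with "or" as written in the text, and the minor allele frequency criterion is assumed to mean a frequency of at least 5%.

```python
# Hypothetical region labels standing in for criterion (a).
FUNCTIONAL_REGIONS = {"5_flank", "5_utr", "3_utr", "coding_nonsynonymous"}


def is_candidate(snp, maf_threshold=0.05):
    """Return True if a SNP record meets criterion (a) or criterion (b).

    Assumption: criterion (b) is read as MAF >= maf_threshold.
    """
    in_functional_region = snp["region"] in FUNCTIONAL_REGIONS
    common_enough = snp["maf"] >= maf_threshold
    return in_functional_region or common_enough


# Made-up records for illustration only (not data from this study).
example_snps = [
    {"id": "snp_a", "region": "3_utr", "maf": 0.01},
    {"id": "snp_b", "region": "intron", "maf": 0.20},
    {"id": "snp_c", "region": "intron", "maf": 0.01},
]

selected = [snp["id"] for snp in example_snps if is_candidate(snp)]
```

If both criteria were meant to apply jointly, the `or` in `is_candidate` would become `and`.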

| DNA extraction and genotyping
Genomic DNA was extracted from the donated peripheral blood by proteinase K digestion and phenol-chloroform extraction and stored at −80°C. Genotyping of these seven SNPs was performed using predesigned TaqMan SNP Genotyping Assays (Applied Biosystems, Foster City, CA). The reaction mixture contained 3.5 μl of genomic DNA (20 ng), 10 μl of 2× TaqMan Genotyping Master Mix, 0.25 μl of the primer and probe mix, and 6.25 μl of double-distilled water. Amplification was performed under the following conditions: 50°C for 2 min and 95°C for 10 min, followed by 45 cycles of 95°C for 15 s and 60°C for 1 min. Amplification and analysis were performed in the 384-well ABI 7900HT Real Time PCR System (Applied Biosystems) following the manufacturer's instructions. SDS 2.4 software (Applied Biosystems) was used for allelic discrimination. The genotyping rates of these SNPs were all above 98%. For quality control, four negative controls were included in each plate and 5% of the samples were randomly selected for repeated genotyping for confirmation; the results were 100% concordant (Cao et al., 2012).
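As a rough check on the cycling program, the nominal run time can be worked out from the hold and cycle steps. The sketch below assumes that each of the 45 cycles consists of 15 s at 95°C followed by 1 min at 60°C, and it ignores ramp times, so the real instrument run would take longer.

```python
# Durations in seconds; the per-cycle reading is an assumption (see lead-in).
HOLD_STEPS_S = [2 * 60, 10 * 60]  # 50°C for 2 min, 95°C for 10 min
CYCLE_STEPS_S = [15, 60]          # 95°C for 15 s, 60°C for 1 min
N_CYCLES = 45


def nominal_runtime_minutes(hold_steps, cycle_steps, n_cycles):
    """Sum the hold steps once and the cycle steps n_cycles times."""
    total_seconds = sum(hold_steps) + n_cycles * sum(cycle_steps)
    return total_seconds / 60


runtime_min = nominal_runtime_minutes(HOLD_STEPS_S, CYCLE_STEPS_S, N_CYCLES)
print(f"Nominal run time: {runtime_min} min")
```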

| Statistical analysis
Differences in the distributions of demographic characteristics, selected variables, and frequencies of genotypes between weak D or DEL and RhD− groups were tested using Student's t-test (for continuous variables) or the χ² test (for categorical variables). SNP frequencies in RhD− participants were tested for departure from Hardy-Weinberg equilibrium using a goodness-of-fit χ² test before further analysis. The associations between SNPs and weak D or DEL were estimated by computing the odds ratios (ORs) and 95% confidence intervals (CIs) from unconditional logistic regression analysis after adjusting for possible confounders. We used the Benjamini-Hochberg method to calculate the false discovery rate (FDR) and adjust the p values for multiple comparisons. The associations were considered statistically significant when the FDR-adjusted p values were <0.05.
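The Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines. This is a generic illustration of the step-up procedure, not the software used in this study, and the p values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p values for four hypothetical tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

Each adjusted value is the smallest FDR level at which the corresponding test would be declared significant, so comparing it with 0.05 reproduces the decision rule stated above.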

| SNP statistics in RHD
The HapMap system was used to analyze the SNPs in RHD following the SNP selection principle described above. LD-Plus was used to supplement the common Haploview style plot by providing additional statistical context for LD statistics. Multiple dimensions of genomic data could be displayed for easy comparison to evaluate the SNP relationships. As observed using Haploview software, seven SNPs in the RHD region (rs592372, rs11485789, rs6669352, rs3118454, rs1053359, rs590787, and rs3927482) were evaluated as the tag-SNPs of RHD (Figure 1).

| SNPs in RHD in the weak D or DEL and RhD− groups
A genomic DNA assay based on a multiplex PCR coupled with a single base extension reaction (Silvy et al., 2011) was performed. This design allowed for simultaneous analysis of genotype and allele frequencies of the SNPs in RHD between the weak D or DEL and RhD− groups in the study populations. The general information of the donors and corresponding genotypes of RHD tag-SNPs are listed in Table 1. To avoid the interference of confounding factors such as age and sex, we adjusted the data using a logistic regression model. The weak D or DEL and RhD− groups appeared to be adequately matched for age (p = 0.173) and sex (p = 0.526). The genotype distributions and allele frequencies of the SNPs in RHD are presented in Table 1. For rs592372, rs11485789, and rs6669352, only one genotype of each SNP was detected in our donors, so these SNPs could not be used to distinguish weak D or DEL from RhD−. Genotype frequencies in the weak D or DEL and RhD− donors for the remaining SNPs were in accordance with the Hardy-Weinberg equilibrium model, where p = 0.569 for rs3118454, p = 0.195 for rs1053359, p = 0.413 for rs590787, and p = 0.432 for rs3927482. As shown in Table 1, the genotype frequencies of rs3118454 were 8.8%, 22.4%, and 68.8% for the CC, CT, and TT genotypes among the weak D or DEL group and 2.2%, 35.7%, and 62.1% among the RhD− group, respectively. The difference between the weak D or DEL and RhD− groups was significant (CT: p = 0.001, CC: p = 0.022). Additionally, the combined CT/TT genotype frequency was significantly lower in the weak D or DEL group than in the RhD− group (91.2% vs. 97.9%, p = 0.008). With the CC genotype as the reference, the variant genotypes (CT and TT) showed a significant decreasing trend (p trend = 0.003).
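The Hardy-Weinberg check reported above is a goodness-of-fit χ² test with one degree of freedom for a biallelic SNP. A minimal sketch follows; the genotype counts in the example are hypothetical and are not taken from Table 1.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes observed counts of the two homozygotes and the heterozygote
    and returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 50 AA, 40 AB, 10 BB.
chi2, p_value = hwe_chi_square(50, 40, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p value above 0.05 is read as no evidence of departure from Hardy-Weinberg equilibrium.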
For rs1053359, the genotype frequencies were 56.8%, 37.3%, and 5.9% for the GG, GC, and CC genotypes among the weak D or DEL group and 24.8%, 45.6%, and 20.0% among the RhD− group, respectively. The difference between the weak D or DEL and RhD− groups was significant (GC: p = 0.0001, CC: p < 0.001). Additionally, the combined GC/CC genotype frequency was lower in the weak D or DEL group than in the RhD− group (43.2% vs. 50.8%, p < 0.0001). With the GG genotype as the reference, the variant genotypes (GC and CC) differed significantly in frequency between the groups (adjusted OR = 2.82, 95% CI [1.62, 4.90] for GC, and OR = 11.06, 95% CI [5.03, 24.31] for CC; p trend < 0.001). Similarly, the combined GC/CC genotypes differed significantly from the GG genotype (OR = 4.04, 95% CI [2.46, 6.76]).
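For orientation, an unadjusted odds ratio and its Wald 95% confidence interval can be computed directly from a 2×2 table. The sketch below is a simplification: the ORs reported here come from logistic regression adjusted for age and sex, which this calculation does not reproduce, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a, b: variant and reference genotype counts in one group
    c, d: variant and reference genotype counts in the other group
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from Table 1.
or_value, ci_low, ci_high = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

An interval that excludes 1 corresponds to a significant association at the 5% level.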
The frequency of the rs590787 CC genotype was 73.6% in the weak D or DEL group and 34.6% in the RhD− group. The difference between the weak D or DEL and RhD− groups was significant (CC: p < 0.001). For rs3927482, the frequency of the TT genotype was 93.6% in the weak D or DEL group and 78.4% in the RhD− group. There was a significant difference between weak D or DEL and RhD− for rs3927482. Taken together, these data suggest that the RHD SNPs rs3118454, rs1053359, rs590787, and rs3927482 are putative biomarkers for weak D or DEL.
Table note: a Number represents the number of risk alleles within the combined genotypes; the risk alleles used for calculation were the rs3118454 T allele, rs1053359 G allele, rs590787 T allele, and rs3927482 G allele. b Adjusted for age and sex in logistic regression model. c Two-sided χ² test for distribution between cases and controls.
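The risk-allele count used for the combined genotypes can be sketched as follows. The risk alleles are those listed in the table note; the dictionary layout and the example donor genotypes are hypothetical.

```python
# Risk alleles as listed in the table note.
RISK_ALLELES = {
    "rs3118454": "T",
    "rs1053359": "G",
    "rs590787": "T",
    "rs3927482": "G",
}


def count_risk_alleles(genotypes):
    """Count risk alleles across the four SNPs for one donor.

    genotypes maps each SNP ID to a two-letter genotype string, e.g. "CT".
    The result ranges from 0 to 8.
    """
    return sum(
        genotypes[snp].count(allele) for snp, allele in RISK_ALLELES.items()
    )


# Hypothetical donor: one T at rs3118454 and two G at rs1053359.
donor = {
    "rs3118454": "CT",
    "rs1053359": "GG",
    "rs590787": "CC",
    "rs3927482": "TT",
}
n_risk = count_risk_alleles(donor)
```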

| Frequency distributions of combined genotypes of SNPs between weak D or DEL and RhD− groups
Analysis of the allelic distribution revealed an association of rs3118454, rs1053359, rs590787, and rs3927482 with the weak D or DEL phenotype (Table 1). To evaluate whether an interaction exists among these polymorphisms, we combined the four SNPs for analysis. As shown in Table 2

| DISCUSSION
In this study, we investigated the potential role of RHD in the RhD phenotype. We identified seven SNPs in RHD and analyzed their frequencies in weak D or DEL and RhD− groups in a Chinese population. Rh cDNA was first described by Avent in 1990 (Avent, Ridgwell, Tanner, & Anstee, 1990). Since then, many researchers have evaluated the antigens of the Rh system (Arce et al., 1993; Cherif-Zahar et al., 1990; Le van Kim et al., 1992). In 1991, Colin established the first structure of the Rh locus (Colin et al., 1991). Numerous studies have confirmed that most RhD− individuals lack RHD (Hua et al., 2010; Pandey, Gautam, & Shukla, 1995; Wang et al., 2009; Xu et al., 2003). The mechanisms for the loss of RHD include gene conversion (Innan, 2003; Shao, Li, Xiong, Zhou, & Li, 2005), gene deletion (Avent et al., 1997; Chang et al., 1998; Fichou et al., 2012; Wagner & Flegel, 2000), antithetical missense mutations, and other missense mutations (Fichou et al., 2012). These mechanisms are associated with SNPs.
We observed a significant association between SNPs in RHD and weak D or DEL individuals, specifically rs3118454, rs1053359, rs590787, and rs3927482. However, the other three polymorphisms showed no effect on the phenotype of weak D or DEL subjects. We also analyzed the effects of the four RHD SNPs together. The results showed a significant association between the combined genotypes and RhD−. Subjects carrying 1-5 variant alleles were less prevalent than those carrying 0 variant alleles in RhD−. These results suggest that the four selected SNPs of RHD affect the phenotype of weak D or DEL individuals in the Chinese population.
In conclusion, we found that the SNPs rs3118454, rs1053359, rs590787, and rs3927482 in RHD were significantly associated with RhD phenotype in a Chinese population.