Association of the variants and haplotypes in the DOCK7, PCSK9 and GALNT2 genes and the risk of hyperlipidaemia

Abstract
Little is known about the association between the single nucleotide polymorphisms (SNPs) and haplotypes of the dedicator of cytokinesis 7 (DOCK7), pro‐protein convertase subtilisin/kexin type 9 (PCSK9) and polypeptide N‐acetylgalactosaminyltransferase 2 (GALNT2) genes and serum lipid traits in Chinese populations. This study aimed to determine the association between nine SNPs in the three genes and their haplotypes and hypercholesterolaemia (HCH)/hypertriglyceridaemia (HTG), and to identify the possible gene–gene interactions among these SNPs. Genotyping was performed in 733 HCH and 540 HTG participants. The haplotype of C‐C‐G‐C‐T‐G‐C‐C‐G [in the order of DOCK7 rs1168013 (G>C), rs10889332 (C>T); PCSK9 rs615563 (G>A), rs7552841 (C>T), rs11206517 (T>G); and GALNT2 rs1997947 (G>A), rs2760537 (C>T), rs4846913 (C>A) and rs11122316 (G>A) SNPs] was associated with an increased risk of HCH and HTG. The haplotypes of C‐C‐G‐C‐T‐G‐C‐C‐A and G‐C‐G‐T‐T‐G‐T‐C‐G were associated with a reduced risk of HCH and HTG. The haplotypes of G‐C‐G‐C‐T‐G‐C‐C‐A and G‐C‐G‐C‐T‐G‐T‐C‐G were associated with an increased risk of HCH. The haplotypes of C‐T‐G‐C‐T‐G‐C‐C‐G, G‐C‐A‐C‐T‐G‐C‐C‐G and G‐C‐G‐C‐T‐G‐C‐C‐A were associated with an increased risk of HTG. The haplotypes of G‐C‐G‐C‐T‐G‐T‐C‐A and G‐C‐G‐T‐T‐G‐T‐C‐G were associated with a reduced risk of HTG. In addition, possible inter‐locus interactions among the DOCK7, PCSK9 and GALNT2 SNPs were also noted. However, further functional studies of these genes are still required to clarify which SNPs are functional and how these genes actually affect the serum lipid levels.


Introduction
Cardiovascular disease (CVD) is the major cause of premature death in European [1] and American countries [2] and in the rest of the world [3]. It is an important cause of disability [4] and contributes substantially to the escalating costs of health care [5]. Hyperlipidaemia is a risk factor for CVD [6] and related complications [7], leading to high morbidity and mortality [8]. The 2013 American College of Cardiology/American Heart Association Guideline on the Treatment of Blood Cholesterol to Reduce Atherosclerotic Cardiovascular Risk in Adults represents a major shift from prior cholesterol management guidelines [9]. The new guidelines introduce several major paradigm shifts, which include aiming for atherosclerotic CVD risk reduction [10] as opposed to targeting low-density lipoprotein cholesterol (LDL-C) levels [11], and recommending an integrated approach to managing hyperlipidaemia to decrease atherosclerotic CVD risk [12]. Although lipid modification was mainly focused on reducing the LDL-C level in the past [13], lowering total cholesterol (TC) [14], triglyceride (TG) [15] and LDL-C levels together was found to be more beneficial than lowering LDL-C alone [16]. Although the risk for hyperlipidaemia has largely been attributed to adult lifestyle factors [17] such as poor nutrition [18], lack of exercise [19] and smoking [20], there is now strong evidence suggesting that predisposition to the development of hyperlipidaemia begins with heredity [21]. The identification of gene variants involved in hyperlipidaemia could provide clues to novel pathogenic mechanisms and thereby to new therapeutic or preventive methods for CVD.
Very large genome-wide association studies (GWAS) of hyperlipidaemia have identified a few novel loci that appear to influence lipid metabolism [22][23][24], including the DOCK7 [25], PCSK9 [26] and GALNT2 [27] loci on chromosome 1. Assessing the association between the DOCK7, PCSK9 and GALNT2 loci identified through GWAS [28][29][30] and the risk of hyperlipidaemia has become fundamental in the validation of these signals. DOCK7 (gene ID: 85440, MedGen: CN189147, OMIM: 615859) is located on chromosome 1p31.3 (Exon count: 53) and encodes the DOCK7 protein. The protein encoded by this gene is a guanine nucleotide exchange factor (GEF) that plays a role in axon formation [31] and neuronal polarization [32]. The encoded protein displays GEF activity towards RAC1 and RAC3 Rho small GTPases, but not towards CDC42 [33]. DOCK7 interaction with TACC3 controls interkinetic nuclear migration and the genesis of neurons from radial glial progenitor cells during cortical development [34]. Several transcript variants encoding different isoforms have been found for this gene [35]. PCSK9 (gene ID: 255738, MedGen: C1863551, OMIM: 603776) is located on chromosome 1p32.3 (Exon count: 14). This gene encodes a member of the subtilisin-like pro-protein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway [36]. The encoded protein undergoes an autocatalytic processing event with its pro-segment in the ER and is constitutively secreted as an inactive protease into the extracellular matrix and trans-Golgi network [37]. It is expressed in liver, intestine and kidney tissues and escorts specific receptors for lysosomal degradation. It plays a role in cholesterol and fatty acid metabolism [38]. Mutations in this gene have been associated with autosomal dominant familial HCH [39]. GALNT2 (gene ID: 2590, OMIM: 602274) is located on chromosome 1q41-q42 (Exon count: 20). This gene encodes a member of the glycosyltransferase 2 protein family. Members of this family initiate mucin-type O-glycosylation of peptides in the Golgi apparatus. The encoded protein may be involved in O-linked glycosylation of the immunoglobulin A1 hinge region. This gene may influence TG levels, and may be involved in type 2 diabetes, as well as several types of cancer [40].
Although the association between some DOCK7, PCSK9 and GALNT2 SNPs and serum lipid levels has been reported in several previous studies, the association of the novel variants, their haplotypes and possible gene-gene interactions with the risk of hyperlipidaemia has not been reported previously. Therefore, this study was performed (i) to assess the association between the DOCK7 (rs1168013 and rs10889332), PCSK9 (rs615563, rs7552841 and rs11206517) and GALNT2 (rs1997947, rs2760537, rs4846913 and rs11122316) SNPs and serum lipid levels in individuals with HCH/HTG; (ii) to evaluate the association of their haplotypes with the risk of HCH/HTG and (iii) to identify the possible gene-gene interactions among these variants in the Chinese population.

Study populations
The participants were recruited from Dongxing City, Guangxi Zhuang Autonomous Region, People's Republic of China in 2012. A total of 1869 participants were randomly selected from our stratified, randomized samples [41]. There were 999 hyperlipidaemic (TC > 5.17 mmol/l and/or TG > 1.70 mmol/l) and 870 normolipidaemic (TC ≤ 5.17 mmol/l and TG ≤ 1.70 mmol/l) individuals, aged 18-80 years. The age and gender distributions were matched between the two populations. Participants with a history of CVD (including coronary artery disease and stroke), diabetes, chronic illness (including cardiac, renal or thyroid problems) and/or a history of taking lipid-modulating medications such as statins or fibrates were excluded. To assess the association of the SNPs with the risk of HCH and HTG separately, the hyperlipidaemic population was subdivided into hypercholesterolaemic (TC > 5.17 mmol/l) and hypertriglyceridaemic (TG > 1.70 mmol/l) groups. Informed consent was obtained from all participants after they had received a full explanation of the study. The study was reviewed and approved by the Ethics Committee of the First Affiliated Hospital, Guangxi Medical University.
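The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs (TC > 5.17 mmol/l, TG > 1.70 mmol/l); the function name and the returned dictionary keys are hypothetical and are not taken from the study's own analysis scripts.

```python
# Cut-offs as stated in the text (mmol/l).
TC_CUTOFF = 5.17
TG_CUTOFF = 1.70


def classify_lipids(tc: float, tg: float) -> dict:
    """Return group memberships for one participant.

    tc, tg: fasting serum total cholesterol and triglyceride in mmol/l.
    A participant can belong to both the HCH and the HTG group.
    """
    hch = tc > TC_CUTOFF   # hypercholesterolaemic
    htg = tg > TG_CUTOFF   # hypertriglyceridaemic
    return {
        "hyperlipidaemic": hch or htg,  # TC and/or TG above the cut-off
        "HCH": hch,
        "HTG": htg,
    }
```

Because the two case groups overlap, the HCH and HTG group sizes need not sum to the number of hyperlipidaemic participants.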

Epidemiological survey and biochemical measurements
The epidemiological survey was carried out by using internationally standardized methods and following a common protocol [42]. Information on demographics, socio-economic status, lifestyle, past medical history and family disease history was collected by using standardized questionnaires. The intake of alcohol was quantified as the number of liangs (about 50 g) of rice wine, corn wine, rum, beer or liquor consumed during the preceding 12 months. Alcohol consumption was categorized into groups of grams of alcohol per day: 0 (non-drinkers), ≤25 and >25. Smoking status was categorized into the groups of cigarettes per day: 0 (non-smokers), ≤20 and >20. The methods of blood pressure, height, weight and waist circumference measurements have been described in the previous studies. Fasting venous blood samples were taken and the levels of serum TC, TG, HDL cholesterol (HDL-C), and LDL-C in the samples were directly determined by enzymatic methods with commercially available kits, Tcho-1, TG-LH (RANDOX Laboratories Ltd., Crumlin Co. Antrim, UK), Cholestest N HDL, and Cholestest LDL (Daiichi Pure Chemicals Co. Ltd., Tokyo, Japan) respectively. Serum apolipoprotein (Apo) A1 and ApoB levels were assessed by the immunoturbidimetric assay by using a commercial kit (RANDOX Laboratories Ltd.). All determinations were performed with an autoanalyzer (Hitachi Ltd., Tokyo, Japan). The normal values of serum TC, TG, HDL-C, LDL-C, ApoA1 and ApoB levels and the ratio of ApoA1 to ApoB in our Clinical Science Experiment Centre were 3.10-5.17, 0.56-1.70, 1.16-1.42, 2.70-3.10 mmol/l, 1.20-1.60, 0.80-1.05 g/l and 1.00-2.50 respectively [43].

SNP selection and genotyping
We selected nine SNPs in the DOCK7, PCSK9 and GALNT2 genes according to the following criteria: (i) tag SNPs, which were established by Haploview (version 4.2; Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA), or functional SNPs in functional areas of the gene fragments (http://www.ncbi.nlm.nih.gov/SNP/snp); (ii) a known minor allele frequency (MAF) higher than 1% in European ancestry from the Human Genome Project Database and (iii) the target SNP region should be adequately replicated by PCR, and the polymorphic site should have a commercially available restriction endonuclease cleavage site so that it can be genotyped by restriction fragment length polymorphism (RFLP).
Genomic DNA was isolated from peripheral blood leucocytes using the phenol-chloroform method [41]. Genotyping of the nine SNPs was performed by PCR and RFLP. The characteristics of each SNP and the details of each primer pair, annealing temperature, length of the PCR products and corresponding restriction enzyme used for genotyping are summarized in Tables S1 and S2. The PCR products of the samples (two samples of each genotype) were sequenced with an ABI Prism 3100 (Applied Biosystems, International Equipment Trading Ltd., Vernon Hills, IL, USA) at Shanghai Sangon Biological Engineering Technology & Services Co. Ltd., Shanghai, China.

Statistical analysis
The statistical analyses were performed with the statistical software package SPSS 19.0 (SPSS Inc., Chicago, IL, USA). The quantitative variables were presented as the mean ± S.D. for those that are normally distributed, and as the medians and interquartile ranges for TG, which is not normally distributed. General characteristics between the two groups were compared by the Student's unpaired t-test. The allele frequency and genotype distribution, as well as the haplotype frequency, between the groups were analysed by the chi-squared test, and the Hardy-Weinberg equilibrium was verified with the standard goodness-of-fit test. Pair-wise linkage disequilibria and haplotype frequencies among the SNPs were analysed using Haploview (version 4.2; Broad Institute of MIT and Harvard). The association between the genotypes and serum lipid parameters was tested by ANCOVA. Any variants associated with a serum lipid parameter at a value of P < 0.005 (corresponding to P < 0.05 after adjusting for 9 independent tests by the Bonferroni correction) were considered statistically significant. Unconditional logistic regression was used to assess the correlation between the risk of hyperlipidaemia and genotypes (DOCK7 rs1168013: GG = 1, CG = 2, CC = 3; rs10889332: CC = 1, CT = 2, TT = 3; PCSK9 rs615563: GG = 1, AG = 2, AA = 3; rs7552841: CC = 1, CT = 2, TT = 3; rs11206517: TT = 1, GT = 2, GG = 3; GALNT2 rs1997947: GG = 1, AG = 2, AA = 3; rs2760537: CC = 1, CT = 2, TT = 3; rs4846913: CC = 1, AC = 2, AA = 3 and rs11122316: GG = 1, AG = 2, AA = 3). Age, sex, body mass index (BMI), smoking and alcohol consumption were adjusted for in the statistical analysis. Two-sided P < 0.05 was considered statistically significant.
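As an illustration of two of the steps named above, the following minimal Python sketch computes a Hardy-Weinberg goodness-of-fit chi-squared test for one biallelic SNP and a Bonferroni-adjusted threshold. It is not the SPSS procedure used in the study; the function names and the genotype counts in the usage note are hypothetical. The P-value uses the closed form for a chi-squared distribution with 1 degree of freedom.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi-squared statistic, P-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample exactly in equilibrium, so the statistic is zero. `bonferroni_threshold(0.05, 9)` gives roughly 0.0056, so the P < 0.005 cut-off used in the text is slightly more conservative than the exact Bonferroni value.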
The inter-locus interaction was analysed by the generalized multifactor dimensionality reduction (GMDR) method, using GMDR software. The cross-validation consistency score measures how consistently the selected interaction is identified as the best model among all the possibilities considered. The testing balanced accuracy measures how accurately the interaction predicts the case-control status, with scores between 0.50 (indicating that the model predicts no better than chance) and 1.00 (indicating perfect prediction). A sign test or a permutation test provides a P-value for the prediction accuracy to measure the significance of an identified model. The best model is selected as the combination of markers with the maximum cross-validation consistency and the minimum prediction error.
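The two summary measures described above can be made concrete with a short Python sketch. It does not reimplement GMDR itself; it only shows how a balanced accuracy and a cross-validation consistency count could be computed from case-control labels and from the best model chosen in each fold. All names and the example inputs are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both cases and controls are required")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def cross_validation_consistency(
    best_models: Sequence[Tuple[str, ...]]
) -> Tuple[Tuple[str, ...], int]:
    """Return the most frequently selected model and its fold count.

    best_models: the SNP combination chosen as best in each CV fold.
    """
    model, count = Counter(best_models).most_common(1)[0]
    return model, count
```

With ten folds, a model selected in seven of them would be reported as 7/10, and a balanced accuracy of 0.50 corresponds to chance-level prediction.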

Characteristics of the studied populations
Tables 1 and 2 compare the general characteristics and serum lipid levels between the HCH and non-HCH populations and between the HTG and non-HTG populations respectively. Both HCH and HTG individuals had significantly higher anthropometric parameters than their controls (P < 0.05-0.001). The age and gender distribution, height, pulse pressure and the percentages of participants who smoked or consumed alcohol did not differ between cases and controls in either the HCH or the HTG comparison (P > 0.05). There was no difference in the level of systolic blood pressure between the HTG and non-HTG populations (P > 0.05).

Haplotypes and the risk of hyperlipidaemia
As shown in Table 7, the haplotype of G-C-G-C-T-G-C-C-G [in the order of DOCK7 rs1168013 (G>C), rs10889332 (C>T); PCSK9 rs615563 (G>A), rs7552841 (C>T), rs11206517 (T>G); and GALNT2 rs1997947 (G>A), rs2760537 (C>T), rs4846913 (C>A) and rs11122316 (G>A) SNPs] was the most common haplotype and represented >10% of the samples. The haplotype of C-C-G-C-T-G-C-C-G was associated with an increased risk of HCH (OR: 3.29, 95% CI: 1.81, 6.00, P < 0.001) and HTG (OR: 3.99, 95% CI: 1.81, 8.77, P < 0.001). The haplotypes of G-C-G-C-T-G-C-C-A and G-C-G-C-T-G-T-C-G were associated with an increased risk of HCH (OR: 1.68, 95% CI: 1.15, 2.46,
Table 8 shows the effects of combinations among the DOCK7, PCSK9 and GALNT2 SNPs, which were analysed by GMDR. The two- and three-locus models showed a significant association with the risk of HCH and HTG (P < 0.01-0.001). For HCH, the two-locus model was chosen as the best one because it had the highest testing accuracy (54.71%) and good cross-validation consistency (7/10). For HTG, the three-locus model was chosen as the best one because it had the highest testing accuracy (59.00%) and good cross-validation consistency (9/10).
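To show how an odds ratio and its 95% confidence interval of the kind reported for the haplotypes are formed, here is a minimal Python sketch using the unadjusted Wald method on a 2 × 2 table. This is an assumption made for illustration: the estimates in the text come from the study's own analysis, which adjusted for covariates, and the counts in the usage note are invented, not study data.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases carrying the haplotype      b: cases not carrying it
    c: controls carrying the haplotype   d: controls not carrying it
    z: normal quantile (1.96 for a 95% interval)
    Returns (odds ratio, lower limit, upper limit).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 and 80 among cases and 10 and 90 among controls, `odds_ratio_wald_ci(20, 80, 10, 90)` returns an odds ratio of 2.25. An interval that excludes 1 indicates a nominally significant association.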

Discussion
The main findings of this study encompass (i) the associations of the DOCK7, PCSK9 and GALNT2 SNPs with serum lipid levels in individuals with HCH and HTG; (ii) the correlation of their haplotypes with HCH/HTG and (iii) possible gene-gene interactions among these variants that influence HCH/HTG. This is the first report on the inter-locus interaction among the DOCK7, PCSK9 and GALNT2 SNPs on serum lipid levels. The observed allele frequencies of the nine SNPs in the non-HCH/non-HTG populations were consistent with those of the International HapMap Chinese Han Beijing sample (http://hapmap.ncbi.nlm.nih.gov/cgi-perl/gbrowse/hapmap27_B36/).
Several previous reports have examined these loci. In a study of the transferability and fine mapping of genome-wide-associated loci, the DOCK7 rs2131925 T allele was associated with serum TC levels in African-Americans [44]; the rs10889353 C allele was correlated with TC and TG levels in the Chinese population [45]; and the rs636523 T allele near DOCK7 was related to plasma TG levels in the Jackson Heart Study [25]. Likewise, in large-scale association studies in several populations, PCSK9 rs17111557 T allele carriers had lower HDL-C than C allele carriers in Brazilians [46]; the common PCSK9 variants rs12067569 and rs505151 were significantly associated with higher LDL-C, and the rare variant rs11591147 (R46L, MAF = 0.9%) was associated with lower LDL-C, in American-Indians [38]; and the E670G SNP in PCSK9 was associated with polygenic HCH in men, but not in women [47]. Moreover, in a large-scale GWAS, the GALNT2 variants were associated with quantitative changes in serum lipid levels. In particular, the G allele frequency of GALNT2 rs4846914 correlated with TG levels in the Korean population [48] but not in healthy Roma and Hungarian populations [49]; GALNT2 D314A mutations segregated in Caucasian families with extremely high HDL-C [50]; and heterozygosity for a loss-of-function mutation in GALNT2 improved plasma TG clearance in man [51]. In the present study, we found that the alleles of rs10889332-T, rs615563-A, rs7552841-T, rs1997947-A, rs2760537-T and rs4846913-A were more frequent in HCH/HTG than in non-HCH/non-HTG populations. The alleles of rs11206517-G and rs11122316-A were more frequent only in HTG than in non-HTG populations.
The levels of TC (rs10889332 and rs7552841), TG (rs10889332, rs7552841, rs11206517, rs1997947, rs4846913 and rs11122316), HDL-C (rs1168013, rs11206517, rs1997947 and rs4846913), LDL-C (rs7552841 and rs1997947), ApoA1 (rs10889332, rs1997947 and rs4846913), ApoB (rs1168013, rs10889332 and rs7552841) and the ratio of ApoA1 to ApoB (rs1168013, rs10889332 and rs7552841) in the hypercholesterolaemic participants were different between the three genotypes (P < 0.005-0.001), whereas the levels of TC (rs1997947 and rs2760537), TG (rs10889332, rs615563, rs7552841, rs1997947, rs4846913 and rs11122316), ApoB (rs615563, rs7552841 and rs1997947) and the ratio of ApoA1 to ApoB (rs4846913) in the normocholesterolaemic individuals were different between the three genotypes. Likewise, the levels of TG (rs1168013, rs10889332 and rs7552841), ApoA1 (rs4846913) and the ratio of ApoA1 to ApoB (rs10889332) in the hypertriglyceridaemic population were different between the genotypes, whereas the levels of TC (rs10889332, rs615563 and rs7552841), TG (rs10889332, rs615563, rs1997947, rs2760537, rs4846913 and rs11122316), HDL-C (rs1168013, rs615563, rs11206517, rs1997947 and rs4846913), LDL-C (rs10889332 and rs7552841), ApoA1 (rs1997947 and rs4846913), ApoB (rs10889332, rs615563, rs7552841 and rs11206517) and the ratio of ApoA1 to ApoB (rs615563, rs7552841, rs11206517 and rs1997947) in the normotriglyceridaemic population were different between the genotypes. The reason for these discrepancies among the studies is not fully understood. Differences in the genetic background, linkage disequilibrium pattern and/or environmental factors may partly explain these discrepancies.
Alirocumab, an inhibitor of PCSK9, significantly reduced levels of LDL-C when added to statin therapy administered at the maximum tolerated dose [52]. Current guidelines suggest high-intensity statin treatment for most high-risk patients [53]. However, only 47% of the study patients were receiving high-dose statins, resulting in a mean baseline LDL-C level of 122 mg/dl. Treatment with high-dose statins would have brought a much higher percentage of patients in the placebo group to the goal of an LDL-C level of less than 70 mg/dl [54]. In addition, appropriate use of high-dose statins would have been associated with a lower rate of major adverse cardiovascular events in the placebo group [55,56]. Thus, a strategy of not exploiting the maximum potential of statins in high-risk patients may have overestimated the benefit of PCSK9 inhibition. The efficacy and safety of the PCSK9 inhibitor alirocumab in reducing lipids and cardiovascular events may be influenced by the SNPs described above. It is expected that the relationship between genetic susceptibility conferred by PCSK9 polymorphisms and the LDL-C-lowering efficacy of alirocumab treatment will be elucidated in the not too distant future. Moreover, participants with a history of taking lipid-modulating medications such as statins, fibrates or PCSK9 inhibitors were excluded from the present study. Nevertheless, the associations between the above genes, serum lipid levels and the lipid-lowering efficacy of treatment need further exploration, especially when LDL-C and TG levels are used to define the groups.
When assessing the association between the DOCK7, PCSK9 and GALNT2 SNPs and the risk of hyperlipidaemia, this study showed that although the variants of DOCK7 rs1168013 and GALNT2 rs2760537 and rs11122316 did not reach a statistically significant association with HCH/HTG risk on their own, they achieved a significant association with the risk of HCH/HTG in combination with other SNPs. In addition, we noticed that the haplotype of C-C-G-C-T-G-C-C-G, carrying the rs11122316-G allele, was associated with an increased risk of HCH and HTG. The haplotypes of C-C-G-C-T-G-C-C-A and G-C-G-T-T-G-T-C-G were associated with a reduced risk of HCH and HTG. The haplotypes of G-C-G-C-T-G-C-C-A and G-C-G-C-T-G-T-C-G were associated with an increased risk of HCH. The haplotypes of C-T-G-C-T-G-C-C-G, G-C-A-C-T-G-C-C-G and G-C-G-C-T-G-C-C-A were associated with an increased risk of HTG. The haplotypes of G-C-G-C-T-G-T-C-A and G-C-G-T-T-G-T-C-G were associated with a reduced risk of HTG. On GMDR analysis, an inter-locus interaction among the DOCK7, PCSK9 and GALNT2 SNPs on serum lipid levels was found in this study. The interaction of rs10889332-rs1997947 was associated with the risk of HCH, and the interactions of rs615563-rs7552841 and/or rs615563-rs7552841-rs4846913 were associated with the risk of HTG. In multi-locus (GMDR) analyses, a significant association with HCH and HTG was found in the two- and three-locus models. These findings indicate that a potential gene-gene interaction might exist among the DOCK7, PCSK9 and GALNT2 SNPs. Unfortunately, no previous study has investigated the inter-locus interaction between these SNPs, and therefore we cannot compare our results with others. Although a statistically significant SNP-SNP interaction was noted in this study, the biological mechanism underlying these genes and their interactions is yet to be defined.
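The nine-letter haplotype strings used throughout list one allele per SNP in the fixed order given in the text (DOCK7 rs1168013, rs10889332; PCSK9 rs615563, rs7552841, rs11206517; GALNT2 rs1997947, rs2760537, rs4846913, rs11122316). The following Python sketch, with hypothetical helper names, decodes such a string and lists the SNPs at which two haplotypes differ.

```python
from typing import Dict, List

# SNP order as stated in the text: (gene, rsID)
SNP_ORDER = [
    ("DOCK7", "rs1168013"), ("DOCK7", "rs10889332"),
    ("PCSK9", "rs615563"), ("PCSK9", "rs7552841"), ("PCSK9", "rs11206517"),
    ("GALNT2", "rs1997947"), ("GALNT2", "rs2760537"),
    ("GALNT2", "rs4846913"), ("GALNT2", "rs11122316"),
]


def decode_haplotype(haplotype: str) -> Dict[str, str]:
    """Map each rsID to the allele carried by the haplotype string."""
    # Accept both the ASCII hyphen and the typographic hyphen (U+2010).
    alleles = haplotype.replace("\u2010", "-").split("-")
    if len(alleles) != len(SNP_ORDER):
        raise ValueError("expected %d alleles" % len(SNP_ORDER))
    return {rsid: allele for (_, rsid), allele in zip(SNP_ORDER, alleles)}


def differing_snps(hap_a: str, hap_b: str) -> List[str]:
    """Return the rsIDs at which two haplotypes carry different alleles."""
    a, b = decode_haplotype(hap_a), decode_haplotype(hap_b)
    return [rsid for _, rsid in SNP_ORDER if a[rsid] != b[rsid]]
```

For instance, `differing_snps("C-C-G-C-T-G-C-C-G", "C-C-G-C-T-G-C-C-A")` returns `["rs11122316"]`: the haplotype associated with increased risk and one of the haplotypes associated with reduced risk differ only at that GALNT2 SNP.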

Study limitations
There are several potential limitations in our study. First, given the MAF of some SNPs, the number of participants was not large enough to provide strong statistical power compared with many previous GWAS and replication studies. Hence, further studies with larger sample sizes are needed to confirm our results. Second, we were unable to adjust for the effect of diet during the statistical analysis. Third, although we have detected interactions of the DOCK7, PCSK9 and GALNT2 SNPs on hyperlipidaemia in this study, many unmeasured environmental and genetic factors still need to be considered. Besides, the gene-environment and environment-environment interactions on serum lipid levels remain to be determined. For a clear understanding of the biological mechanism underlying hyperlipidaemia, a large number of common variants with small effects and rare variants with large effects remain to be identified. Moreover, the relevance of these findings has to be defined in further high-calibre studies that incorporate the genetic information of the DOCK7, PCSK9 and GALNT2 SNPs and their haplotypes, together with in vitro functional studies to confirm the impact of a variant at the molecular level.

Conclusions
Our study confirmed that the genetic variants are replicable in the Southern Chinese hyperlipidaemic and normolipidaemic populations. The haplotype of C-C-G-C-T-G-C-C-G was associated with an increased risk of HCH and HTG. The haplotypes of C-C-G-C-T-G-C-C-A and G-C-G-T-T-G-T-C-G were associated with a reduced risk of HCH and HTG. The haplotypes of G-C-G-C-T-G-C-C-A and G-C-G-C-T-G-T-C-G were associated with an increased risk of HCH. The haplotypes of C-T-G-C-T-G-C-C-G, G-C-A-C-T-G-C-C-G and G-C-G-C-T-G-C-C-A were associated with an increased risk of HTG. The haplotypes of G-C-G-C-T-G-T-C-A and G-C-G-T-T-G-T-C-G were associated with a reduced risk of HTG. In addition, possible inter-locus interactions among the DOCK7, PCSK9 and GALNT2 SNPs were also noted. However, further functional studies of these genes are still required to clarify which SNPs are functional and how these genes actually affect the serum lipid levels. Taking all the facts into consideration, it is possible that the significant SNPs identified in the DOCK7, PCSK9 and GALNT2 regions are in high linkage disequilibrium with functional SNPs in other genes that are known to affect lipid metabolism. Thus, an in-depth study of the biological actions of these genes is crucial. It is expected that the physiological function of DOCK7, PCSK9 and GALNT2 will be elucidated in the not too distant future.
Table 7 The association between the DOCK7, PCSK9 and GALNT2 haplotypes and hypercholesterolaemia/hypertriglyceridaemia