A novel association between a SNP in CYBRD1 and serum ferritin levels in a cohort study of HFE hereditary haemochromatosis

Authors


Lyle Gurrin, Centre for MEGA Epidemiology, School of Population Health, University of Melbourne, Carlton, Vic. 3053, Australia. E-mail: lgurrin@unimelb.edu.au

Summary

There is emerging evidence that there are genetic modifiers of iron indices for HFE gene mutation carriers at risk of hereditary haemochromatosis. A random sample, stratified by HFE genotype, of 863 participants from a cohort of 31 192 people of northern European descent provided blood samples for genotyping of 476 single nucleotide polymorphisms (SNPs) in 44 genes involved in iron metabolism. Single SNP association testing, using linear regression models adjusted for sex, menopause and HFE genotype, was conducted for four continuously distributed outcomes: serum ferritin (log transformed), transferrin saturation, serum transferrin, and serum iron. The SNP rs884409 in CYBRD1 is a novel modifier specific to HFE C282Y homozygotes. Median unadjusted serum ferritin concentration decreased from 1194 μg/l (N = 27) to 387 μg/l (N = 16) for male C282Y homozygotes and from 357 μg/l (N = 42) to 69 μg/l (N = 12) for females, comparing those with no copies to those with one copy of rs884409. Functional testing of this CYBRD1 promoter polymorphism using a heterologous expression assay showed a 30% decrease in basal promoter activity relative to the common genotype (P = 0·004). This putative genetic modifier of iron overload expression accounts for 11% (95% CI 0·4%, 22·6%) of the variance in serum ferritin levels of C282Y homozygotes.

More than 80% of Caucasians with hereditary haemochromatosis (HH) are homozygous for the C282Y mutation in the HFE gene, but only 28% of men and 1% of women with that genotype develop iron overload-related disease (Allen et al, 2008). There is strong evidence that the moderate penetrance of C282Y homozygosity for iron-overload disease is partly due to modifying genetic factors. A high concordance of iron indices and/or iron-related disease has been found between related individuals (Crawford et al, 1993; Whiting et al, 2002; Njajou et al, 2006; McLaren et al, 2008), although this is also consistent with the presence of shared environmental factors that modify the iron phenotype. Bulaj et al (2000) reported that C282Y homozygotes who were relatives of symptomatic probands were more likely to have symptoms of HH disease (defined as cirrhosis, fibrosis grade 2 or above, aminotransferase above 1·2 of upper reference range in the absence of other cause, and radiographically confirmed arthropathy of the metacarpophalangeal joints). Two other studies of relatives of C282Y homozygotes showed at least the same prevalence of elevated iron indices and disease symptoms for those C282Y homozygote relatives identified through family screening as for those C282Y homozygotes ascertained via a proband presenting with symptoms (McCune et al, 2006; Powell et al, 2006). Stronger evidence supporting the presence of genetic modifiers was reported by Whitfield et al (2000) who studied a sample of both monozygotic and dizygotic twin pairs and showed that the pattern of residual variation in serum iron indices, after adjusting for an effect of the C282Y mutation, was consistent with the additive effects of multiple genes. 
After correcting for age and body-mass index, they estimated the proportion of variance explained by additive genetic factors, for men and women respectively, was 23% and 31% for iron, 66% and 49% for transferrin, 33% and 47% for transferrin saturation (TS), and 47% and 47% for ferritin.

Studies of inbred mice strains provide direct evidence of genetic contributions to iron homeostasis; a more than two-fold variation in biochemical measurements of iron status, including serum iron, TS, and hepatic iron, was found when mice were fed a basal iron diet (Leboeuf et al, 1995). Data from studies of Hfe knockout mice have shown that the degree of hepatic iron loading varies considerably depending on the genetic background on which the knockout is placed (Leboeuf et al, 1995; Sproule et al, 2001).

Since the HFE gene was identified in 1996 there have been substantial advances in the understanding of the molecular processes that control body iron loading. The liver-derived peptide hepcidin plays a central role in this process and has been shown to act on the small intestine to limit iron intake, and most iron loading disorders share the common feature of inappropriately low hepcidin levels (Ganz, 2008). HFE has been shown to be an upstream regulator of hepcidin, as have transferrin receptor 2 (TfR2), haemojuvelin (HJV), members of the BMP-SMAD signalling pathway and proinflammatory cytokines (Darshan & Anderson, 2009). Mutations in any of the genes encoding components of this regulatory network, or those facilitating iron transport across the intestinal epithelium, have the potential to alter iron loading.

A recent study by Milet et al (2007) of a cohort of 592 unrelated C282Y homozygous probands who attended the Liver Unit in Rennes, France since 1990 was the first to show strong evidence for an association between a measured genetic variant (a single nucleotide polymorphism [SNP] in the BMP2 gene; BMP2 is a stimulator of hepcidin expression) and the serum ferritin (SF) levels of C282Y homozygotes. BMP6 has recently been shown to be the key endogenous regulator of hepcidin (Andriopoulos et al, 2009; Camaschella, 2009; Meynard et al, 2009). Mutations in the TMPRSS6 gene (another upstream regulator of hepcidin) have been implicated in iron-refractory iron deficiency anaemia (Finberg et al, 2008; Guillem et al, 2008; Melis et al, 2008) through linkage studies, although these results are based on a few extended pedigrees and may have limited relevance at the population level. Mutations in many other genes are known to cause serious disruption of normal iron metabolism (e.g. HFE2, HAMP, SLC40A1, TFR2) but the causal mutations are very rare (Wallace & Subramaniam, 2007).

A recent study (Benyamin et al, 2009) investigated genetic influences on markers of iron status using a cohort of twins and their siblings and conducting a genome-wide association study (GWAS) on four serum markers of iron status (iron, transferrin, TS and ferritin). As well as confirming previously reported associations of HFE C282Y on all four markers, they found strong associations between a TMPRSS6 SNP (rs4820268) and serum iron, and several TF SNPs (rs3811647, rs1358024, rs4525863 and rs6794945 in the adjacent SRPRB gene) and serum transferrin. Although this study identified SNP associations worthy of further investigation, the analysis was unable to adjust for menopause status in women or to examine associations within all HFE genotype groups due to small sample sizes.

A major limitation to the understanding of genetic modifiers of iron indices has been a paucity of data derived from large, prospective population-based studies. Here we present results from a candidate gene study using a prospective cohort design where selection was stratified on the two most important HFE mutations, C282Y and H63D. A series of genes involved in iron metabolism were selected assuming that SNPs in these genes are likely to act as modifiers of iron phenotypes. SNPs from these genes were genotyped, and biochemical and clinical phenotypes were recorded prospectively before many participants were aware of their HFE status, in an HFE-stratified random sample of participants of northern European descent (Allen et al, 2008).

Material and methods

Study subjects

Between 1990 and 1994, 41 528 people (24 479 women) aged between 27 and 75 years were recruited for the Melbourne Collaborative Cohort Study (MCCS) (Giles & English, 2002); 99·3% were 40–69 years old. Recruitment was via the Australian Electoral Roll (both registration and voting are compulsory in Australia), advertisements and community announcements in local media. Participants attended a study centre where they were interviewed about a range of lifestyle and dietary factors, had physical measurements taken and provided a blood sample. For the present study, known as ‘HealthIron’, participants born in southern Europe were excluded, leaving 31 192 participants born in Australia, the United Kingdom, Ireland or New Zealand (i.e. almost exclusively of northern European ancestry).

Preliminary HFE genotyping of baseline samples

DNA from stored baseline samples was extracted from Guthrie cards (n = 23 484) using a Chelex method or from buffy coats (Corbett Buffy Coat CorProtocol 14102, Corbett, Sydney, Australia) which had been stored in liquid nitrogen (n = 7708) and genotyped for the C282Y and H63D alleles of the HFE gene using Taqman® real-time polymerase chain reaction (PCR) probes, Applied Biosystems, Carlsbad, California, USA. Genotyping for the H63D allele was performed only on samples from people heterozygous for the C282Y allele. The overall successful genotyping percentage (including missing samples) was 95% (29 676/31 192).

HealthIron clinics

Between 2004 and 2006, letters of invitation to participate in a HealthIron clinical assessment were sent to a sample of 1438 participants which included all C282Y homozygotes and a stratified random sample of participants from the remaining HFE genotype groups. Participants completed a computer-assisted personal interview and provided a cheekbrush sample for confirmatory HFE genotyping (Applied Biosystems 7000 real-time PCR with Taqman® probes, Applied Biosystems, Carlsbad, California, USA). Stored frozen serum collected in the morning at baseline MCCS attendance was used for measurement of serum iron indices (SF, serum transferrin, TS and serum iron).

All participants gave written, informed consent to participate in the MCCS and the HealthIron study. Both study protocols were approved by The Cancer Council Victoria’s Human Research Ethics Committee. Participants who provided a blood sample at the HealthIron clinic were genotyped for selected SNPs using DNA extracted from buffy coats.

Genotyping of SNPs from candidate genes

Tag SNPs were selected based on a candidate gene approach using resequencing data from a variety of sources and HapMap (Constantine et al, 2008). Resequencing of exonic regions in 11 genes was carried out on samples from 94 randomly selected (54 wild type, 22 H63D simple heterozygotes, 18 C282Y simple heterozygotes, two compound heterozygotes, two H63D homozygotes, and one C282Y homozygote) and 94 C282Y homozygous participants in HealthIron. Two other resequencing data sets were included: (i) the Hemochromatosis and Iron Overload Screening (HEIRS) ancillary study resequencing, completed by the Resequencing and Genotyping Service of the National Heart Lung and Blood Institute of samples from five populations for 14 genes (EMBL accession numbers are TFR2:DQ496110, TF: DQ525716, HFE2: DQ309445, SLC46A1: DQ496103, TFRC: DQ496099, PGRMC2: DQ496105, PGRMC1: DQ496104, IREB2: DQ496102, HEPH: DQ496100, HAMP: DQ496109, FTH1: DQ496108, FLVCR1: DQ496107, CYBRD1: DQ496101, and ACO1: DQ496106) and (ii) the Seattle SNPs database which has data for three genes of interest (TF, HMOX1, TNF). HapMap has SNP information across the entire human genome, although only data from phase 1 of the HapMap was available at the time of SNP selection. For genes that appeared in more than one of these data resources, sets of tag SNPs were identified separately from each resource and the union of these tag SNPs was selected for genotyping, providing a high level of redundancy to overcome the possibility of assay failure.

We combined these data from Caucasians to select 384 SNPs for the first round of genotyping, which was conducted using the Illumina GoldenGate platform. The second round SNPs were selected for new candidate genes, along with some SNPs in high linkage disequilibrium (LD) with significant associations from the first round and five SNPs repeated from the first round. These second round SNP assays were conducted using the Sequenom iPlex platform with four multiplexes of 35, 34, 33 and 23 SNPs. SNPs which failed to form good clusters, deviated from Hardy–Weinberg equilibrium (HWE) at P < 0·01, were monomorphic, or had fewer than five heterozygotes were excluded from further analysis.
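The exclusion rules above can be sketched as a small filter on genotype counts. This is only an illustration: the text does not say which HWE test was used, so a 1-df Pearson chi-square test is assumed here, and the function names and the counts in the usage note are hypothetical.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df Pearson chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(n_aa, n_ab, n_bb, hwe_alpha=0.01, min_het=5):
    """Apply the stated rules: drop (near) monomorphic SNPs and HWE deviations."""
    if n_ab < min_het:                        # monomorphic or fewer than five heterozygotes
        return False
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= hwe_alpha
```

For example, hypothetical counts of (250, 500, 250) are exactly in HWE and pass, whereas (400, 200, 400) show a large heterozygote deficit and fail. Failure to form genotype clusters is a property of the assay signal and is not modelled here.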

Statistical methods

SF, TS, serum transferrin, and serum iron concentrations were analysed as continuously distributed outcomes, with SF transformed to the natural logarithm scale. Associations between serum iron indices and measured SNP genotypes were determined using separate linear regression models for each SNP, which was included as a three category exposure variable (zero, one, or two copies of the minor allele), along with three other covariates: HFE genotype (six categories: wildtype, C282Y simple heterozygotes, H63D simple heterozygotes, C282Y/H63D compound heterozygotes, H63D homozygotes, C282Y homozygotes), sex and menopause status for women (pre- or post-menopausal at baseline, post-menopausal included those who had a hysterectomy). Age was not included in these analyses because we found, as have others, that after adjusting for women’s menopausal status it had little association with iron indices (Koziol et al, 2001; Adams, 2008; Gurrin et al, 2008).
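As an illustration of how these covariates might enter a design matrix, the sketch below codes one participant as a row of indicator variables. The reference categories (no copies of the minor allele, wildtype HFE, men) and the coding of menopause as an indicator that applies to women only are assumptions, as are all names.

```python
HFE_LEVELS = ["wildtype", "C282Y_het", "H63D_het",
              "compound_het", "H63D_hom", "C282Y_hom"]

def design_row(minor_copies, hfe, sex, post_menopausal=False):
    """Indicator coding for one participant (hypothetical reference categories)."""
    row = [1.0]                                                          # intercept
    row += [1.0 if minor_copies == k else 0.0 for k in (1, 2)]          # SNP: one or two copies
    row += [1.0 if hfe == level else 0.0 for level in HFE_LEVELS[1:]]   # HFE genotype
    female = sex == "F"
    row.append(1.0 if female else 0.0)                                   # sex
    row.append(1.0 if female and post_menopausal else 0.0)               # menopause (women only)
    return row
```

A post-menopausal woman who is a C282Y homozygote with one copy of the minor allele would be coded as `design_row(1, "C282Y_hom", "F", True)`, giving a row of length 10.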

For each candidate SNP, we compared each of the regression models including that candidate SNP with the corresponding model that included only the three other covariates using the likelihood ratio test to determine if that SNP was associated with the outcome (Lettre et al, 2007). SNP genotype groups with five or fewer participants were excluded, so many of these tests were effectively comparisons between two predominant genotype groups with exclusion of the minor allele homozygous group. SNPs for which this comparison generated a P-value of <0·01, chosen to keep the expected number of false-positive associations to less than five for each iron index, were retained for further investigation. The percentage of variation in an iron index explained by a given SNP was calculated as the difference in r2 values from linear regression models with and without that SNP as a covariate; 95% confidence intervals (CIs) for these quantities were calculated using 1000 bootstrap samples.
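A minimal sketch of the two calculations described above follows, assuming Gaussian linear models fitted by ordinary least squares and a percentile bootstrap (the text does not specify the bootstrap interval type). The helper names and the simulated data are hypothetical and serve only to make the example runnable.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rss(X, y):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2 for r, v in zip(X, y))

def lrt(X0, X1, y):
    """Likelihood ratio test of nested linear models; X1 adds 1 or 2 SNP columns to X0."""
    df = len(X1[0]) - len(X0[0])
    assert df in (1, 2)
    stat = max(0.0, len(y) * math.log(rss(X0, y) / rss(X1, y)))
    # Closed-form chi-square upper tail for 1 df and 2 df
    p = math.erfc(math.sqrt(stat / 2)) if df == 1 else math.exp(-stat / 2)
    return stat, p

def delta_r2(X0, X1, y):
    """Difference in r2 between the models with and without the SNP."""
    mean = sum(y) / len(y)
    tss = sum((v - mean) ** 2 for v in y)
    return (rss(X0, y) - rss(X1, y)) / tss

def bootstrap_ci(X0, X1, y, reps=1000, seed=1):
    """Percentile 95% CI for delta_r2 from resampled participants."""
    rng = random.Random(seed)
    n, stats = len(y), []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(delta_r2([X0[i] for i in idx], [X1[i] for i in idx],
                              [y[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * reps)], stats[int(0.975 * reps) - 1]

def simulate(n=60, seed=0):
    """Simulated log-ferritin values with a sex effect and a SNP carrier effect."""
    rng = random.Random(seed)
    X0, X1, y = [], [], []
    for i in range(n):
        female, carrier = i % 2, (i // 2) % 2
        X0.append([1.0, float(female)])
        X1.append([1.0, float(female), float(carrier)])
        y.append(5.0 + 1.0 * female - 0.8 * carrier + rng.gauss(0, 0.3))
    return X0, X1, y
```

Calling `lrt(*simulate())` compares the model with and without the carrier indicator on the simulated data, and `bootstrap_ci(*simulate())` gives the corresponding interval for the variance explained. In practice a statistics package would replace the hand-written least squares routine.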

We used the Beagle (Browning & Browning, 2007) programme to test for associations between two binary outcomes (classifying SF or TS as above or below median values specific for groups defined by HFE genotype, sex and menopause) and SNP haplotypes (Appendix 1).

Functional CYBRD1 promoter testing

A 486-base pair fragment of the proximal human CYBRD1 promoter was amplified from genomic DNA derived from intestinal Caco-2 cells and cloned into the pGL3-basic luciferase reporter. The SNP rs884409 was generated by site-directed mutagenesis as per manufacturer’s protocol (QuickChange®; Stratagene, La Jolla, CA, USA), whereby the nucleotide guanine was substituted for thymine at position −326 relative to the transcriptional start site (+1). All constructs were sequenced. Wild-type and mutant reporter constructs were transiently transfected (Fugene 6, Roche, Basel, Switzerland) into Caco-2 cells and luciferase activity was measured 48 h later using the Dual-Luciferase Kit (Promega, Madison, WI, USA) and a luminometer.

Results

Genotyping

We genotyped SNPs in 44 candidate genes (see Table I) previously reported to have a known or hypothesized role in iron homeostasis. DNAs from 863 participants were genotyped: 121 C282Y homozygotes (56 male, 65 female), 148 C282Y/H63D compound heterozygotes (68 male, 80 female), 204 C282Y heterozygotes (96 male, 108 female), 104 H63D heterozygotes (41 male, 63 female), 12 H63D homozygotes (4 male, 8 female), and 274 with neither C282Y nor H63D (122 male, 152 female). These participants were genotyped for 384 SNPs using GoldenGate (Constantine et al, 2008) and 125 SNPs in four multiplexes using iPlex. Twenty-nine SNPs were excluded: 10 GoldenGate SNPs and 10 iPlex SNPs did not form genotype clusters and thus could not be scored; two GoldenGate SNPs and one iPlex SNP were near monomorphic (fewer than five heterozygous participants) in our sample; three GoldenGate SNPs were monomorphic; and three iPlex SNPs deviated from HWE with P < 0·01. Four SNPs from round one were successfully repeated in the second round, so a total of 476 unique SNPs were scored. Table I shows the number of first and second round SNPs per gene, and the full SNP list is given in Appendix SI. Appendix 2 describes some minor genotyping errors discovered during the study affecting a small proportion (3–11%) of individuals for five SNPs (TFRC rs9846149; TNF rs1800630; HMOX1 rs9607267; TF rs4241357; CUBN rs7094474).
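The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes that the four repeated SNPs are counted against the second round, which is what the Table I totals (369 and 107) suggest; the variable names are illustrative.

```python
# Participants genotyped, by HFE genotype group (male, female)
groups = [(56, 65), (68, 80), (96, 108), (41, 63), (4, 8), (122, 152)]
participants = sum(m + f for m, f in groups)

first_round_typed, second_round_typed = 384, 125
first_round_excluded = 10 + 2 + 3    # GoldenGate: no clusters, near monomorphic, monomorphic
second_round_excluded = 10 + 1 + 3   # iPlex: no clusters, near monomorphic, HWE deviation
repeated = 4                          # first round SNPs repeated successfully in round two

first_round_scored = first_round_typed - first_round_excluded
# Assumption: repeats are removed from the second round tally
second_round_scored = second_round_typed - second_round_excluded - repeated
unique_snps = first_round_scored + second_round_scored

print(participants, first_round_scored, second_round_scored, unique_snps)
```

This reproduces the 863 participants, the 29 exclusions and the 476 unique SNPs stated in the text.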

Table I.   Number of SNPs successfully genotyped per gene; the complete list of SNPs is given in Appendix SI.
*There are known HapMap phase 1 SNPs in these genes which were not genotyped, nor were they in strong linkage disequilibrium (r2 > 0·8) with any of the typed SNPs, i.e. tagSNP coverage was incomplete for these genes.
Gene    First round    Second round

ACO115
BMP226
BMP417
BTBD91
CALR1
CD16314
CP25
CUBN*29
CYBRD1268
DHCR75
EXOC6*3
FLVCR1162
SLC25A3719
FTH17
FTL1
FXN6
GAST2
GDF154
GSTP11
HAMP5
SLC46A15
HEPH17
HEPHL19
HFE61
HFE241
HMOX1101
HMOX211
HP8
IREB217
MON1A2
PGRMC111
PGRMC23
PLEKHB22
SLC11A213
SLC40A112
SMAD16
SMAD410
SMAD54
STEAP3*10
TF432
TFR241
TFRC291
TMPRSS68
TNF28
Total    369    107

Single SNP association with four outcomes

At the significance level of P < 0·01, there were four SNPs associated with SF, four with TS, 24 with serum transferrin, and four with serum iron (Tables II, III, IV and V). Since some of the genotyped SNPs were in strong LD with other nearby SNPs, for each iron index we looked at SNPs in the same gene with P-values for association with that index of <0·01, and grouped those SNPs that were highly correlated with each other (r2 > 0·8). When each of these groups was considered to represent a single association, the number of unique associations with TS dropped from four to three and, for serum transferrin, dropped from 24 to 13. There were no changes in the number of associations for the other two iron indices.
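The grouping step can be sketched as follows. The text does not state exactly how groups were formed, so this example assumes single-linkage grouping, in which any two SNPs with pairwise r2 above the threshold end up in the same group; the function name and the r2 value for `snp_x` are hypothetical.

```python
def ld_groups(snps, r2, threshold=0.8):
    """Group SNPs connected by pairwise r2 above the threshold (union-find)."""
    parent = {s: s for s in snps}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path compression
            s = parent[s]
        return s

    for (a, b), value in r2.items():
        if value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in snps:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())

# rs1284859 and rs931591 have r2 = 0.99 (Table II footnote); snp_x is hypothetical
snps = ["rs1284859", "rs931591", "snp_x"]
r2 = {("rs1284859", "rs931591"): 0.99, ("rs931591", "snp_x"): 0.30}
unique_associations = len(ld_groups(snps, r2))
```

Here three associated SNPs collapse to two unique associations, in the same way that the 24 serum transferrin associations collapse to 13.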

Table II.   Single SNPs associated with serum ferritin in a regression model adjusting for sex, menopause and HFE genotype. Difference from mean (estimated regression coefficient) for common homozygotes is shown for heterozygotes and minor allele homozygotes (where N > 5).

SNP           Gene      Minor allele frequency    Difference heterozygote    Difference rare homozygote    Likelihood ratio test P-value
rs7294582     CD163     0·101                     0·26 ± 0·10                −0·78 ± 0·37                  0·0021
rs1284859*    FLVCR1    0·057                     −0·34 ± 0·13               -                             0·0075
rs884409      CYBRD1    0·165                     −0·24 ± 0·09               −0·40 ± 0·23                  0·0080

*r2 between rs1284859 and rs931591 is 0·99.

Table III.   Single SNPs associated with transferrin saturation in a regression model adjusting for sex, menopause and HFE genotype. Difference from mean (estimated regression coefficient) for common homozygotes is shown for heterozygotes and minor allele homozygotes (where N > 5).

SNP                 Gene       Minor allele frequency    Difference heterozygote    Difference rare homozygote    Likelihood ratio test P-value
rs4820268* D521D    TMPRSS6    0·462                     −3·98 ± 1·12               −5·71 ± 1·39                  0·0001
rs11254389          CUBN       0·167                     −2·39 ± 1·07               7·18 ± 2·77                   0·0014
rs5756506*          TMPRSS6    0·372                     0·52 ± 1·04                4·71 ± 1·49                   0·0032
rs1049296 S589P     TF         0·163                     −0·21 ± 1·08               8·65 ± 2·84                   0·0083

*r2 between rs5756506 and rs4820268 is −0·70.

Table IV.   Single SNPs associated with serum transferrin in a regression model adjusting for sex, menopause and HFE genotype. Difference from mean (estimated regression coefficient) for common homozygotes is shown for heterozygotes and minor allele homozygotes (if N > 5).

SNP (Group with r2 > 0·8)*    Gene      Minor allele frequency    Difference heterozygote    Difference rare homozygote    Likelihood ratio test P-value
rs1880669†                    TF        0·401                     −0·10 ± 0·03               −0·21 ± 0·04                  <0·0001
rs3811647‡                    TF        0·321                     0·12 ± 0·03                0·33 ± 0·05                   <0·0001
rs1358024§                    TF        0·154                     0·09 ± 0·03                0·36 ± 0·08                   <0·0001
rs1405023¶                    TF        0·477                     −0·08 ± 0·03               −0·15 ± 0·04                  0·0004
rs6488340                     CD163     0·084                     −0·00 ± 0·04               0·54 ± 0·15                   0·0014
rs12493168                    TF        0·142                     0·08 ± 0·03                0·22 ± 0·08                   0·0021
rs1799852 L247L               TF        0·099                     −0·11 ± 0·04               -                             0·0023
rs8177326                     TF        0·022                     0·21 ± 0·07                -                             0·0024
rs8177215**                   TF        0·052                     −0·13 ± 0·05               -                             0·0039
rs11254389                    CUBN      0·167                     0·10 ± 0·03                −0·03 ± 0·08                  0·0040
rs838102                      STEAP3    0·441                     −0·10 ± 0·03               −0·04 ± 0·04                  0·0056
rs1045537                     HFE       0·066                     0·12 ± 0·04                −0·17 ± 0·14                  0·0083
rs1049296 S589P               TF        0·163                     −0·08 ± 0·03               −0·17 ± 0·08                  0·0093

*For groups of SNPs with r2 > 0·8, only one SNP per group is shown in the table:
†rs1880669, rs2692695.
‡rs3811647, rs3811658, rs8177240, rs8177260, rs8177178, rs4459901, rs3811656.
§rs1358024, rs8177297.
¶rs8177237, rs1405023.
**rs8177185, rs8177215, rs8177235.

Table V.   Single SNPs associated with serum iron in a regression model adjusting for sex, menopause and HFE genotype. Difference from mean (estimated regression coefficient) for common homozygotes is shown for heterozygotes and minor allele homozygotes (where N > 5).

SNP                 Gene       Minor allele frequency    Difference heterozygote    Difference rare homozygote    Likelihood ratio test P-value
rs4820268* D521D    TMPRSS6    0·462                     −1·89 ± 0·60               −3·10 ± 0·75                  0·0001
rs5756506*          TMPRSS6    0·372                     0·02 ± 0·56                2·53 ± 0·80                   0·0028
rs7094474           CUBN       0·426                     0·24 ± 0·57                −1·92 ± 0·75                  0·0071
rs7219746           GAST       0·422                     −0·14 ± 0·57               −2·11 ± 0·74                  0·0086

*r2 between rs5756506 and rs4820268 was −0·70.

The results from the Beagle analysis (Appendix 1) found 11 associations with permutation P-value <0·2. Four of the seven unique associations found in our regression analysis for TS and SF were also detected using Beagle.

C282Y homozygote-specific modifier

The SNP rs884409 is in the CYBRD1 promoter, 326 base pairs upstream from the transcription start site. A first round SNP, rs3806566, was found to be associated with SF; examination of data from the National Heart, Lung and Blood Institute (NHLBI) and HapMap showed that it is in strong LD (r2 > 0·8) with 17 other variants across the gene. Within this group, rs884409 is located in the promoter and is therefore likely to be functional, so it was selected for second round typing. Although the summary statistics in Table VI suggest an association of rs884409 with SF that is consistent in direction across HFE genotype groups, there is evidence that the magnitude of association varies across HFE-genotype and sex groups (P = 0·0006), with the strongest association for C282Y homozygotes. For C282Y homozygotes the model r2 values were 41·2% (95% CI 25·4%, 57·0%) and 30·6% (95% CI 13·9%, 47·3%) with and without the SNP respectively, so this SNP explains 10·6% (95% CI 0·4%, 22·6%) of the variation in SF.

Table VI.   Mean serum ferritin by CYBRD1 rs884409 genotype for six subgroups based on sex, menopause status and HFE C282Y homozygosity. Values are serum ferritin (μg/l) geometric mean (95% CI) and group size, by copies of rs884409.

Subgroup                                     No copies                   One copy                  Two copies
C282Y homozygous men                         1047 (775, 1415), n = 27    315 (134, 739), n = 16    -
C282Y homozygous women post-menopause        324 (223, 469), n = 28      150 (30, 753), n = 6      -
C282Y homozygous women pre-menopause         100 (45, 221), n = 14       24 (5, 118), n = 6        8, n = 1
non-C282Y homozygous men                     177 (156, 202), n = 201     183 (146, 230), n = 72    152 (68, 343), n = 7
non-C282Y homozygous women post-menopause    106 (89, 125), n = 114      89 (68, 117), n = 48      83 (34, 201), n = 4
non-C282Y homozygous women pre-menopause     43 (35, 52), n = 111        34 (25, 46), n = 39       27 (17, 44), n = 10

Functional testing of this promoter polymorphism using a heterologous expression assay found significantly (P = 0·004) decreased promoter activity compared with the more common genotype (Fig 1). A further SNP combination (rs2356782 and rs3731976 which occur together in exon 1 of CYBRD1 (Lee et al, 2002)) was also tested and showed an even greater decrease in promoter activity, but our data revealed no phenotypic association with these two SNPs.

Figure 1.   Promoter activity without (wild-type) and with the rs884409 SNP. Wild-type or mutant (rs884409) luciferase reporter constructs were transiently transfected into Caco-2 cells and relative luciferase activity (RLU) was measured 48 h later. The rs884409 SNP is located at nucleotide −326 relative to the transcription start site (TSS, +1) and had a significant negative association with basal promoter activity (P = 0·0043). Data are representative of two independent experiments measured in triplicate, presented as means ± standard deviation and analysed using 1-way analysis of variance and Tukey’s post hoc test. Data are presented as a percentage of wild-type promoter activity, which was set to 100.

Replication of previous studies

Our results confirm and extend the findings of Benyamin et al (2009), which showed that TMPRSS6 rs4820268 was associated with lower TS and lower serum iron, and rs3811647 and rs1358024, both in TF with r2 = 0·6, were associated with greater serum transferrin (Tables SI–SIV). By contrast, we did not replicate the previous finding of Milet et al (2007) which reported an association with BMP2 rs235756 and SF for C282Y homozygotes (Table SV).

Discussion

This report has provided evidence of a novel association between SF and CYBRD1 rs884409 using a prospective cohort study in which participants were unselected for either iron levels or disease status. CYBRD1 is a ferric reductase known to contribute to iron uptake (Latunde-Dada et al, 2008). Participants with one or two copies of the minor allele for this SNP had lower mean levels of SF than those with the more common genotype, after adjusting for HFE-genotype, sex and menopausal status for women. This association was determined using a regression modelling approach that effectively averaged the magnitude of the associations of this SNP on SF levels across HFE-genotype groups. The association was, however, stronger for C282Y homozygotes than for the other HFE-genotype groups. Results from a functional luciferase reporter assay showed that this SNP, which is localized to the proximal region of the CYBRD1 promoter, resulted in a 30% decrease in its basal activity. This SNP might thus decrease the basal expression of CYBRD1, and lower iron absorption by the gut, therefore protecting against the iron loading that would otherwise occur for C282Y homozygotes as they absorb excess iron.

The discovery of this modifier of the haemochromatosis-related phenotype has potential clinical relevance because SF is strongly correlated with body iron stores and is used routinely in clinical practice to indicate the extent of iron loading and to determine whether patients should be referred for liver biopsy (typically if SF > 1000 μg/l). Of male C282Y homozygotes with no copies of rs884409, 67% recorded a SF above 1000 μg/l, while for those with one copy of rs884409 only 21% had SF levels above 1000 μg/l. While a SNP in high LD with rs884409 (rs13009270, in our population r2 = 0·93) is on the Illumina 300K chip used by Benyamin et al (2009), the paucity of HFE C282Y homozygotes in their sample might explain why this SNP did not appear in their top ten SNPs associated with SF. The Benyamin et al (2009) GWAS study found the strongest associations were between SNPs in TF and serum transferrin. Serum transferrin is not usually considered a sensitive marker of iron status in itself, but is combined with measured serum iron to calculate the TS. With the exception of variants in HFE, the associations reported by Benyamin et al (2009) between genetic variants and TS were much weaker than those revealed for serum transferrin.

Our results confirm those of Benyamin et al (2009) in two respects. First, we confirmed two of the main novel findings of the earlier study (Benyamin et al, 2009), namely that rs4820268 in TMPRSS6 is associated with lower serum iron and TS and that two SNPs in TF (rs3811647 and rs1358024 with r2 = 0·6) are associated with increased mean levels of serum transferrin. We have extended their conclusions by showing that these associations occur within HFE genotype groups, in particular C282Y homozygotes. The previously reported association of rs1800562 (C282Y) with these four iron indices was also demonstrated by our study. None of the other top ten SNPs reported by Benyamin et al (2009) for their four outcomes (Table SIV) were tested in this study. Second, we found a number of associations between variants in TF and serum transferrin. We took a different approach to determining how many of these were effectively repeated occurrences of a single association due to strong LD between typed SNPs, by grouping SNPs with pairwise r2 values greater than 0·80 rather than invoking a stepwise regression with multiple SNPs. We found nine separate associations between SNPs in TF and serum transferrin from 20 associations in total, whereas Benyamin et al (2009) found just three from 16 initial associations.

Adjustment for multiple testing is complicated by the relatively high level of LD that remains for the SNPs typed in our study. One approach would be to calculate the effective number of independent or unique SNPs using methods based on principal components analysis (PCA) (Nyholt, 2004; Gauderman et al, 2007), which in our case reduces the number of SNPs from 476 to an effective 229. Bonferroni correction would result in no significant associations (with an overall P-value of 0·05) for SF, one for TS, three for serum transferrin, and one for serum iron. Screening at P = 0·01 for associations between each outcome and the almost 500 SNPs that were genotyped would generate on average 500 × 0·01 = 5 associations, assuming independent tests, even if the global null hypothesis of no association with any SNP were true. The number of putative associations exceeded this benchmark for only one of the iron indices, serum transferrin.
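The figures in this paragraph can be reproduced from the tabulated P-values. The sketch below assumes that the Bonferroni threshold is 0·05 divided by the effective number of SNPs (229), uses the rounded P-values from Tables II to V, and enters values reported as <0·0001 as 0·0001; these are assumptions made for illustration rather than a statement of the exact procedure used.

```python
n_effective = 229
alpha_overall, alpha_screen = 0.05, 0.01

# Expected chance associations per iron index when screening ~500 SNPs at P = 0.01
expected_false_positives = 500 * alpha_screen

# Assumed Bonferroni threshold based on the effective number of SNPs
bonferroni_threshold = alpha_overall / n_effective

# Rounded P-values as tabulated; "<0.0001" is entered as 0.0001
p_values = {
    "serum ferritin": [0.0021, 0.0075, 0.0080],
    "transferrin saturation": [0.0001, 0.0014, 0.0032, 0.0083],
    "serum transferrin": [0.0001, 0.0001, 0.0001, 0.0004, 0.0014, 0.0021, 0.0023,
                          0.0024, 0.0039, 0.0040, 0.0056, 0.0083, 0.0093],
    "serum iron": [0.0001, 0.0028, 0.0071, 0.0086],
}

survivors = {index: sum(p < bonferroni_threshold for p in values)
             for index, values in p_values.items()}
print(round(bonferroni_threshold, 6), survivors)
```

Under these assumptions the threshold is about 0·00022, and the counts of surviving associations are none for SF, one for TS, three for serum transferrin and one for serum iron, matching the text.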

Results from our study did not support the association reported by Milet et al (2007) between BMP2 rs235756 and SF. However, the two sample ascertainments and study designs are very different; Milet et al (2007) focussed on unrelated probands, generating a sample that they recognized ‘is not strictly representative of the population of C282Y homozygotes’ and ‘is rich in individuals with serious symptoms’. This suggests that the SNP identified in BMP2 may modify the SF levels of those C282Y homozygotes who have already progressed to clinical symptoms of disease.

In conclusion, evidence continues to accumulate indicating that SNPs in genes involved in the regulation of iron metabolism potentially modify the biochemical penetrance of the C282Y homozygous phenotype and explain, at least in part, the large diversity of expression of iron indices for C282Y homozygotes. As further information on candidate genes is generated, these genetic modifiers may result in stratification of people homozygous for C282Y into those who are unlikely to ever progress to SF levels sufficiently high to cause iron-overload related disease through to those for whom continued monitoring is of benefit until the age of 55–60 years.

Acknowledgement

Supported by grant 1-RO1-DK061885-01A2 from the National Institutes of Health, and NHMRC project grants 251668 and 509174. The authors express their gratitude to the hundreds of Melbourne residents who participated in the HealthIron study.

Appendices

Appendix 1 – Beagle analysis

Haplotype analysis

We used the Beagle software (Browning & Browning, 2007), which implements a graphical model based on variable length Markov chains to represent LD between SNPs, to test for association between a binary outcome and SNP haplotypes. Currently, Beagle accepts only binary outcome measures, so we defined a binary outcome for SF and TS by comparing the observed iron index to a subgroup specific median iron index, with subgroups defined by the same three covariates used in the regression modelling above, namely HFE-genotype, sex and menopausal status. The settings used were scale 3 and shift 0·2.
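The construction of the binary outcome can be sketched as below. The record layout and function name are hypothetical, and coding values equal to the subgroup median as 0 is an assumption, since the text does not say how ties were handled.

```python
from collections import defaultdict
from statistics import median

def binarise_by_subgroup_median(records, index):
    """Return 1 if a participant's iron index exceeds their subgroup median, else 0."""
    def key(r):
        return (r["hfe"], r["sex"], r["menopause"])

    values = defaultdict(list)
    for r in records:
        values[key(r)].append(r[index])
    medians = {k: median(v) for k, v in values.items()}
    # Assumption: values equal to the median are coded 0
    return [1 if r[index] > medians[key(r)] else 0 for r in records]
```

Each participant is compared only with others sharing the same HFE genotype, sex and menopausal status, so the binary outcome is by construction roughly balanced within every subgroup.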

The analysis using Beagle supported four of the seven single SNP associations from regression models described in the text [one for serum ferritin (SF), three for transferrin saturation (TS)] by generating P-values of <0·20 based on permutation tests (Table I).

Table I.   Association of SNP haplotypes with a binary outcome classifying SF or transferrin saturation (TS) as above or below median values specific for groups defined by HFE genotype, sex and menopause (in women).

| Outcome | Gene | SNP | P-value | Model | Permutation P-value | Other SNPs in high LD (r2 > 0·8) | Single SNP associations from regression models |
|---|---|---|---|---|---|---|---|
| TS | SLC46A1 | rs2239908 | 0·00073 | Recessive | 0·020 | rs2239907 | |
| TS | SLC46A1 | hcp1000942 | 0·00940 | Allelic | 0·139 | rs9905973 | |
| TS | CUBN | rs11254389 | 0·00172 | Recessive | 0·199 | | yes |
| TS* | EXOC6 | rs787625 | 0·00453 | Overdominant | 0·069 | | |
| TS* | SLC11A2 | rs427020 | 0·00395 | Recessive | 0·078 | | |
| TS | TMPRSS6 | rs48020268 | 0·00338 | Allelic | 0·126 | | yes |
| TS | TMPRSS6 | rs5756506 | 0·00448 | Recessive | 0·161 | | yes |
| SF | CYBRD1 | rs884409 | 0·00030 | Allelic | 0·027 | rs3806566, rs12621138, rs13009270, rs17288294 | yes |
| SF* | HEPHL1 | rs1518563 | 0·00763 | Dominant | 0·180 | | |
| SF | HFE2 | rs16827043 | 0·02639 | Dominant | 0·140 | | |
| SF | HFE2 | hfe2006712 | 0·03137 | Allelic | 0·192 | | |

*These represent haplotype associations rather than single SNP associations.

Appendix 2 – Genotyping discrepancies

Examination of all pairwise combinations of single nucleotide polymorphisms (SNPs) within each gene (total 4516 pairs) found three pairs of SNPs with no compound heterozygotes, despite minor allele frequencies >0·05: TFRC rs9846149 and TFRC019335; TNF rs1800630 and rs1800610 (genotype counts shown in Table I); HMOX1 rs9607267 and rs11912889. Further evaluation of the genotype counts suggests that the genotyping error (for compound heterozygotes only) occurred in the first SNP listed.

We hypothesize that this is due to an adjacent untyped SNP in high linkage disequilibrium (LD) with the second SNP of the pair, which interferes with the assay and acts as a null allele when it occurs on the same chromosome as the tested SNP. This was supported in two of the three pairs, where the minor allele homozygotes of the second SNP failed to be typed for the first SNP (N = 3 in TFRC; N = 5 in TNF). In the second round for TNF we typed a SNP (rs1799724) that is 6 base pairs from rs1800630 and, as expected under the interference hypothesis, it was in high LD with rs1800610 and showed no compound heterozygotes with rs1800630. For the two pairs with data available, HapMap data obtained from the Genome Variation Server also showed a deficit of compound heterozygotes (rs1800630 and rs1799724: observed 0, expected 6·2, and seven rs1799724 rare homozygotes failed to type for rs1800630; rs9607267 and rs11912889: observed 1, expected 12·5).

Examination of the distribution of the P-values for all SNP pairs, reflecting the difference between the observed and expected number of compound heterozygotes, showed excesses at each end of this distribution. For P-values above 0·95 the excess is expected because of LD; for P-values below 0·05, however, there was an excess of c. 400 SNP pairs (9%) showing significant deficiencies of compound heterozygotes.
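As an illustration of the observed versus expected comparison, the sketch below computes the number of compound heterozygotes expected for the TNF pair, using the fully typed participants in Table I of this appendix. It assumes that genotypes at the two SNPs are independent, which is a simplification: the method used to derive the expected counts is not specified here, and LD between the SNPs would change the expectation. The function name is hypothetical.

```python
def expected_double_heterozygotes(table):
    """Expected count of individuals heterozygous at both SNPs.

    table[i][j] is the number of people with genotype i at SNP 1 and
    genotype j at SNP 2 (0 = wild type, 1 = heterozygote, 2 = minor
    allele homozygote). Assumption: genotypes at the two SNPs are
    independent, so the expectation is (row total x column total) / N.
    """
    n = sum(sum(row) for row in table)
    het_snp1 = sum(table[1])
    het_snp2 = sum(row[1] for row in table)
    return het_snp1 * het_snp2 / n


# Fully typed participants only (rows: rs1800630, columns: rs1800610).
tnf = [
    [536, 96, 0],
    [193, 0, 0],
    [19, 11, 0],
]
observed = tnf[1][1]
expected = expected_double_heterozygotes(tnf)
```

Under this simplified assumption about 24 compound heterozygotes would be expected among the 855 fully typed participants, whereas none were observed.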

There were 46 SNPs (43 Golden Gate, 4 iPlex, 1 SNP both) that had some resequencing data (on a maximum of 188 participants) as well as custom genotyping. The highest number of discrepancies was 15/176 comparisons for rs4241357 in TF. Inspection of the Golden Gate scatter plot showed that these 15 individuals all fell within a second heterozygous cluster, whereas resequencing identified them as minor allele homozygotes. The rs7094474 SNP was assayed twice (once by Golden Gate and once by iPlex) and showed a group of 96 discordant genotypes (heterozygotes by Golden Gate and common homozygotes by iPlex). This was sufficient for the deviation of the iPlex genotypes from Hardy–Weinberg equilibrium (HWE) to meet our exclusion criterion of P < 0·01. The other four genotyping errors described, each confined to a subgroup, did not cause HWE deviation to reach our exclusion criterion.
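The HWE exclusion rule can be illustrated with a short sketch. It assumes a Pearson chi-square goodness-of-fit test with one degree of freedom, which may differ from the test actually applied in the study; the function names and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Takes counts of the two homozygote classes and the heterozygotes.
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def fails_hwe(n_aa, n_ab, n_bb, threshold=0.01):
    """True if the SNP would be excluded at the given P-value threshold."""
    return hwe_chi_square(n_aa, n_ab, n_bb)[1] < threshold


# Hypothetical counts: the first set is in exact HWE, the second has a
# marked deficit of heterozygotes.
in_equilibrium = fails_hwe(100, 200, 100)
het_deficit = fails_hwe(150, 100, 150)
```

A block of heterozygotes miscalled as common homozygotes, as described for rs7094474, produces a heterozygote deficit of the kind shown in the second set of counts.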

Overall, five SNPs (4/269 Golden Gate and 1/107 iPlex) showed mild discrepancies (involving 24–96 samples out of 863, i.e. 3–11%).

Table I.   Observed number of genotypes at a pair of TNF SNPs showing missing compound heterozygotes and failure of the rs1800630 assay for minor allele homozygotes at the rs1800610 locus.

| rs1800630 (rows) / rs1800610 (columns) | Wild type | Heterozygotes | Minor allele homozygotes | Missing | Total |
|---|---|---|---|---|---|
| Wild type | 536 | 96 | 0 | 0 | 632 |
| Heterozygotes | 193 | 0 | 0 | 3 | 196 |
| Minor allele homozygotes | 19 | 11 | 0 | 0 | 30 |
| Missing | 0 | 0 | 5 | 0 | 5 |
| Total | 748 | 107 | 5 | 3 | 863 |
