Bilirubin is an endogenous metabolite of haem that possesses potent natural antioxidant effects. Of interest, elevated serum bilirubin concentration is strongly associated with protection against many immune and inflammatory diseases including cancer, rheumatoid arthritis, nonalcoholic fatty liver disease and diabetes mellitus, as well as ischaemic heart disease (IHD) and myocardial infarction (MI) [6, 7]. However, it remains unclear whether bilirubin directly mediates disease protection or is simply a biomarker for other risk factors or causal mechanisms. The distinction is important: if bilirubin is merely a correlated biomarker, it might only serve a limited prognostic or diagnostic role for IHD; whereas, if it is mechanistically involved in IHD causation, it could serve as a target for new therapies.
In this issue of the Journal of Internal Medicine, Stender et al. attempt to decipher the nature of the association between total bilirubin concentration and IHD using a statistical approach known as Mendelian randomization (MR). A total of 67 608 European participants including 11 686 IHD patients from three population-based studies were genotyped for a genetic variant in UGT1A1 encoding UDP-glucuronosyltransferase 1A1 that was previously associated with total bilirubin concentration in healthy subjects. They demonstrate that one particular UGT1A1 genotype increased total bilirubin concentration maximally by ~95% (P < 0.001). If bilirubin acted directly in pathways affecting IHD progression, individuals with this UGT1A1 genotype would be expected to be protected. However, the authors found that multifactorially adjusted total bilirubin concentration was not associated with protection against IHD [odds ratio (OR) = 0.93; P = 0.29] or MI (OR = 0.90; P = 0.35), nor was UGT1A1 genotype associated with either IHD (OR = 1.03; P = 0.73) or MI (OR = 1.01; P = 0.68). These data strongly suggest that total bilirubin is not causally associated with protection against IHD.
MR is gaining recognition amongst genetic epidemiologists for its ostensible ability to evaluate causality between disease risk factors and end-points. The rationale behind MR is that genetic variation that is associated with a risk factor may serve as a proxy for exposure to varying doses of that metabolic trait. Thus, an association between a genetic variant and both the risk factor and disease end-point implies a causal relationship between the risk factor and disease end-point (Fig. 1, middle). The primary advantage of the MR study design is that it is robust to confounding environmental variables and reverse causation [10, 11]. Parental alleles of any given genetic variant are transmitted to offspring according to Mendel's law of independent assortment, effectively randomizing genotypes at conception. This process is akin to randomization of treatment groups within a randomized controlled trial (RCT), thus reducing confounding from potential environmental exposures. Furthermore, reverse causation cannot confound the MR-based association because alleles are fixed at birth and temporally precede disease onset. Thus, when a causal association exists between a risk factor and disease end-point, the genetic variant that governs that risk factor should have a commensurate association with the disease end-point.
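The rationale above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the analysis performed by Stender et al.: the allele frequency (0.3), the effect sizes, the continuous outcome (IHD is in reality a binary end-point) and all function names are hypothetical. It simulates a risk factor with no causal effect on the outcome but a shared unmeasured confounder, then compares the ordinary observational slope with the ratio of the genotype-outcome to the genotype-risk factor covariance (often called the Wald ratio in instrumental variable analysis).

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / (n - 1)


def simulate(n, causal_effect, seed=0):
    """Simulate genotype g, risk factor x and outcome y for n individuals.

    All numeric values are illustrative assumptions, not estimates from data.
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Allele count (0, 1 or 2), assigned independently of the confounder,
        # mimicking randomization of genotype at conception.
        gi = sum(rng.random() < 0.3 for _ in range(2))
        ui = rng.gauss(0, 1)  # unmeasured confounder
        # Risk factor depends on genotype and on the confounder.
        xi = 1.0 * gi + ui + rng.gauss(0, 1)
        # Outcome depends on the risk factor (causally) and on the confounder.
        yi = causal_effect * xi - 1.0 * ui + rng.gauss(0, 1)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def observational_slope(x, y):
    """Slope of y regressed on x; biased when a confounder affects both."""
    return covariance(x, y) / covariance(x, x)


def wald_ratio(g, x, y):
    """Genotype-outcome covariance divided by genotype-risk factor covariance."""
    return covariance(g, y) / covariance(g, x)


if __name__ == "__main__":
    g, x, y = simulate(50000, causal_effect=0.0, seed=1)
    print("observational slope:", round(observational_slope(x, y), 3))
    print("genotype-based estimate:", round(wald_ratio(g, x, y), 3))
```

With a true causal effect of zero, the observational slope should come out clearly negative (an apparently "protective" risk factor produced purely by confounding), whereas the genotype-based estimate should lie close to zero, because genotype is independent of the confounder in this simulation.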
MR is a biological variation of instrumental variable analysis, which is used for causal inference in the field of econometrics. The application of MR to biomedicine was proposed by Katan in 1986, who suggested using APOE isotype as a surrogate for serum cholesterol concentration to evaluate the suspected causal association between low cholesterol and increased cancer risk. He reasoned that subjects carrying APOE E2/E2 alleles who were genetically predisposed to lower cholesterol concentrations compared with those with APOE E4/E4 alleles would be expected to have increased cancer rates if low cholesterol concentration indeed caused cancer. This clever approach showed that whilst APOE genotypes were associated with total and low-density lipoprotein (LDL) cholesterol concentrations, they were not associated with cancer risk, thus refuting the suggested causal relationship. This conclusion has since been upheld by several RCTs of statin therapy showing no relationship between pharmacological LDL-cholesterol–lowering and cancer risk.
Recent genome-wide association (GWA) studies have reported hundreds of genetic variants that are reliably associated with disease risk factors, similar to the association between APOE genotype and plasma cholesterol concentration. The robust genotype–phenotype associations from GWA studies serve as ideal instruments for MR analysis to assess causality between other controversial risk factors and disease end-points. For instance, recent MR studies have supported causal roles in cardiovascular disease (CVD) for both plasma triglycerides and lipoprotein(a) [15, 16]; they have also contradicted a causal relationship between serum C-reactive protein and CVD [17, 18]. Most recently, MR has undermined the findings of nearly 30 years of in vivo and in vitro research demonstrating a protective role for high-density lipoprotein (HDL) in reverse cholesterol transport, by suggesting that genetically mediated increases in HDL cholesterol concentration are not causally associated with atheroprotection [20, 21].
Although MR is a persuasive statistical approach, the integrity of the analysis depends on several key assumptions: that genetic variants are (i) independent of other variables that interact with either the risk factor or end-point, (ii) reliably associated with the risk factor, and (iii) associated with the end-point only through a single risk factor. However, the intrinsic complexity of biological systems makes satisfying these criteria difficult [10, 11]. For example, the UGT1A locus is complex, expressing nine distinct coding sequences (one of which is UGT1A1) each with three possible splicing isoforms. Linkage disequilibrium (LD) surrounding this locus is appreciable, meaning that genetic variation in UGT1A1 may simultaneously act as a proxy for other unknown genetic variants at this locus with their own protective or harmful effects on CVD. Furthermore, UGT1A1 is a pleiotropic gene whose product is known to metabolize many different hormones, toxins and drugs. Both LD and pleiotropy could lead to confounding effects that would violate the basic assumptions of MR (Fig. 1). Unmeasured confounding effects opposite to those of the measured risk factor might negate a causal association where one exists, whereas unmeasured parallel effects might mimic a causal association where one does not exist. Thus, it is difficult to knowingly satisfy all MR assumptions, making it challenging to demonstrate with absolute certainty that a causal association exists between a disease risk factor and end-point.
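The way a violation of assumption (iii) can negate or mimic a causal association can be sketched with a simple simulation. This is a hypothetical illustration only: the allele frequency, the effect sizes, the continuous outcome and the function names are assumptions chosen for clarity and are not taken from any study. Here the genetic variant influences the outcome both through the measured risk factor and through a second, unmeasured pathway, as might occur with pleiotropy or LD with another functional variant.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / (n - 1)


def wald_ratio_with_pleiotropy(n, causal_effect, direct_effect, seed=0):
    """Genotype-based estimate when the variant also acts on the outcome
    through an unmeasured pathway (direct_effect).

    All numeric values are illustrative assumptions.
    """
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = sum(rng.random() < 0.3 for _ in range(2))  # allele count 0, 1 or 2
        # Each allele raises the measured risk factor by 1 unit.
        xi = 1.0 * gi + rng.gauss(0, 1)
        # Outcome: causal path via the risk factor plus an unmeasured path
        # from genotype that bypasses the risk factor.
        yi = causal_effect * xi + direct_effect * gi + rng.gauss(0, 1)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return covariance(g, y) / covariance(g, x)


if __name__ == "__main__":
    n = 50000
    # No pleiotropy: the estimate should recover the causal effect.
    print("valid instrument:", round(wald_ratio_with_pleiotropy(n, -0.5, 0.0, seed=1), 3))
    # Opposing unmeasured effect: a real protective effect is masked.
    print("opposing pathway:", round(wald_ratio_with_pleiotropy(n, -0.5, 0.5, seed=1), 3))
    # Parallel unmeasured effect: protection appears where none exists.
    print("parallel pathway:", round(wald_ratio_with_pleiotropy(n, 0.0, -0.5, seed=1), 3))
```

Under these assumptions the estimate converges on the causal effect plus the unmeasured effect per unit of genetically raised risk factor, so an opposing pathway drives it towards zero and a parallel pathway produces an apparent effect of about -0.5 even though the risk factor itself does nothing.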
Despite these limitations, MR studies appear to have gained significant influence over established experimental approaches; but when did MR become the definitive test of causal inference in human disease? Historically, the first postulates regarding causal inference were proposed in 1884 by Henle and Koch to define the pathogenicity of microorganisms. The Hill criteria were later proposed in 1965 to provide a preliminary basis for objective evaluation of the role of risk factors in disease pathophysiology. Both the Henle–Koch postulates and the Hill criteria have long served as the guiding principles for causal inference in biomedical research; however, neither was proposed as the final arbiter of causality; both were simply intended to direct biological characterization and hypothesis generation. By comparison, recent MR studies seem to have become the de facto be-all-and-end-all of causality.
Using genetic epidemiology, Stender et al. demonstrate convincingly that total bilirubin is not causally associated with IHD or MI; however, recent in vitro experimental studies have provided apparently conflicting results. For example, hyperbilirubinaemia is characteristic of Gilbert's syndrome, a benign condition affecting 5–10% of Europeans in which a common genetic variant in the UGT1A1 promoter decreases expression of the UDP-glucuronosyltransferase required for conjugation and elimination of bilirubin. It has been shown that plasma from individuals with Gilbert's syndrome has an increased serum antioxidant capacity, increased concentration of reduced thiol and glutathione, elevated HDL:LDL ratio and reduced proatherogenic oxidized LDL and small dense LDL [25-27]. Assuming a noncausal relationship between bilirubin and IHD, these studies would collectively imply that the cardioprotective phenotypes observed in those with Gilbert's syndrome are not mediated by elevated bilirubin concentration or even genetic variation in UGT1A1, unless these phenotypes do not in fact mediate measurable cardioprotection in individuals. Rather, they would suggest that other genes and mechanisms underlie the atheroprotective phenotypes observed in Gilbert's syndrome. The apparent discrepancy between these experimental studies and the current MR study will require further evaluation.
In summary, recent MR studies have cast a long shadow over causal mechanisms derived from traditional biological experiments. Stender et al. have provided important evidence supporting one aspect of the association between total bilirubin and IHD. However, the final relevance of MR in the context of other experimental evidence remains unclear. MR studies can provide important information about exposure to a risk factor and development of disease, without confounding and reverse causation; however, it may not be possible to definitively satisfy the assumptions of MR even in well-elucidated biological systems. As such, MR studies may be important for hypothesis generation, especially in poorly understood biological systems, but whether they should automatically negate a whole body of evidence generated by previous experiments using different study designs is debatable. We believe that whilst the power of MR should be embraced, one should nonetheless be careful about using this approach to dismiss the findings of carefully acquired experimental evidence regarding the causality of disease.