Mendelian Randomization Analysis With Multiple Genetic Variants Using Summarized Data
ABSTRACT
Genome‐wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual‐level data in simulation studies. We investigate the impact of gene–gene interactions, linkage disequilibrium, and ‘weak instruments’ on these estimates. Both an inverse‐variance weighted average of variant‐specific associations and a likelihood‐based approach for summarized data give similar estimates and precision to the two‐stage least squares method for individual‐level data, even when there are gene–gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P‐value in a linear regression of the risk factor on each variant is less than $1 \times 10^{-5}$, then weak instrument bias will be small. We use these methods to estimate the causal effect of low‐density lipoprotein cholesterol (LDL‐C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL‐C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are similarly efficient to those using individual‐level data, although the necessary assumptions cannot be so fully assessed.
Introduction
Mendelian randomization is a technique for using genetic variants to estimate the causal effect of a modifiable risk factor from observational data [Davey Smith and Ebrahim, 2003]. It has recently been used to strengthen the evidence for causal roles in coronary heart disease of interleukin‐6 [Swerdlow et al., 2012] and lipoprotein(a) [Kamstrup et al., 2009]. A limitation of Mendelian randomization is that genetic variants often only explain a small fraction of the variation in the risk factor of interest [Davey Smith and Ebrahim, 2004], so that assessing some causal associations requires sample sizes running into tens of thousands to obtain adequate power [Schatzkin et al., 2009]. This problem can be partially redressed by the use of multiple genetic variants [Palmer et al., 2011]. If each variant explains additional variation in the risk factor, then a combined causal estimate using all of the variants will have greater precision than the estimate from any of the individual variants [Pierce et al., 2011].
One potential source of such data is genome‐wide association (GWA) studies, which examine the associations of many genetic variants with a trait. Many large GWA study consortia have been assembled, with sample sizes in some cases running into hundreds of thousands [Ehret et al., 2011]. Individual‐level data on study participants are not always available due to issues of practicality and confidentiality of data‐sharing on such a large scale. Presentations of results from GWA studies often report the summary associations of all variants that have reached a certain P‐value threshold, and recently the release of association estimates in published GWA studies for all measured variants has been advocated [Editorial, 2012]. We investigate methods for using these summarized genetic associations with a risk factor and an outcome to estimate the causal effect of the risk factor on the outcome.
A genetic variant can be used to estimate a causal effect if it satisfies three assumptions:
- it is associated with the risk factor,
- it is not associated with any confounder of the risk factor–outcome association,
- it is conditionally independent of the outcome given the risk factor and confounders.
A variant satisfying these assumptions is known as an instrumental variable (IV) [Greenland, 2000]. With a single genetic variant used as an IV and a continuous outcome, assuming all associations are linear, the causal effect of the risk factor on the outcome can be estimated as the ratio of the change in the outcome per additional variant allele divided by the change in the risk factor per additional variant allele [Thomas and Conti, 2004]. With individual‐level data, each of these changes can be estimated using linear regression. For a binary outcome, such as disease, a log‐linear or other appropriate regression model can be used in the regression of the outcome on the variant [Didelez et al., 2010]. If summarized (aggregated) data are available in the form of these regression coefficients, the ratio estimate of the causal effect can be calculated without recourse to individual‐level data [Harbord et al., 2013]. However, with multiple variants, it is not clear how to integrate these genetic association estimates together into a single estimate of the causal effect.
Methods
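We assume that summarized data are available for multiple genetic variants that are single nucleotide polymorphisms (SNPs) and satisfy the IV assumptions for the risk factor of interest X and the outcome Y. Genetic variant k, $k = 1, \ldots, K$, is associated with an observed $X_k$ mean change in the risk factor per additional variant allele with standard error $\sigma_{Xk}$, and an observed $Y_k$ mean change in the outcome per allele with standard error $\sigma_{Yk}$ (if the outcome is binary, $Y_k$ could represent the per allele change in the log‐odds or the log‐probability of an outcome). Two methods are presented for the estimation of a causal effect using summarized data. The first, which has been previously used in applied investigations, combines the ratio estimates from the individual variants employing inverse‐variance weights [Ehret et al., 2011]. The second is a novel likelihood‐based method, with independent likelihood contributions from each of the variants.
Inverse‐Variance Weighted Combination of Ratio Estimates
The ratio estimate of the causal effect of X on Y using genetic variant k is $Y_k / X_k$. The standard error of the ratio estimate can be approximated using the delta method; the leading term is $\sigma_{Yk} / |X_k|$ [Thomas et al., 2007]. By using this expression for the standard error, an inverse‐variance weighted (IVW) estimate of the causal effect combines the ratio estimates using each variant in a fixed‐effect meta‐analysis model:
$$\hat{\beta}_{IVW} = \frac{\sum_k X_k Y_k \sigma_{Yk}^{-2}}{\sum_k X_k^2 \sigma_{Yk}^{-2}} \tag{1}$$
$$\mathrm{se}(\hat{\beta}_{IVW}) = \sqrt{\frac{1}{\sum_k X_k^2 \sigma_{Yk}^{-2}}} \tag{2}$$
Likelihood‐Based Method
A likelihood‐based model can be constructed by assuming a bivariate normal distribution for the estimated associations of each variant with the risk factor and the outcome, in which the mean association with the outcome is the causal effect β multiplied by the mean association with the risk factor $\xi_k$:
$$\begin{pmatrix} X_k \\ Y_k \end{pmatrix} \sim \mathcal{N}_2 \left( \begin{pmatrix} \xi_k \\ \beta \xi_k \end{pmatrix}, \begin{pmatrix} \sigma_{Xk}^2 & \rho \sigma_{Xk} \sigma_{Yk} \\ \rho \sigma_{Xk} \sigma_{Yk} & \sigma_{Yk}^2 \end{pmatrix} \right) \tag{3}$$
The correlation ρ between estimates $X_k$ and $Y_k$ obtained from a single source can be specified as the observational correlation between the risk factor and the outcome. If the estimates $X_k$ and $Y_k$ are derived from independent sources, this correlation will be zero. If $Y_k$ is the per allele change in the log‐odds or the log‐probability of an outcome, then β represents a log odds ratio or a log relative risk parameter, respectively.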
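The two summarized data estimators can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: it writes `x[k]` and `y[k]` for the per allele associations of variant k with the risk factor and the outcome, and `se_x[k]`, `se_y[k]` for their standard errors (names chosen for this sketch). The likelihood is that of a bivariate normal model with known variances and a common correlation `rho`; the nuisance means are profiled out analytically, and the one-dimensional search bracket around the IVW estimate is an assumption of the sketch.

```python
import math


def ivw_estimate(x, y, se_y):
    """IVW estimate and standard error: a weighted mean of the ratios
    y[k] / x[k] with weights x[k]**2 / se_y[k]**2."""
    num = sum(xk * yk / sy ** 2 for xk, yk, sy in zip(x, y, se_y))
    den = sum(xk ** 2 / sy ** 2 for xk, sy in zip(x, se_y))
    return num / den, math.sqrt(1.0 / den)


def profile_deviance(beta, x, y, se_x, se_y, rho):
    """-2 log-likelihood (up to a constant) of the bivariate normal model,
    after maximizing analytically over each variant's mean association
    with the risk factor."""
    total = 0.0
    for xk, yk, sx, sy in zip(x, y, se_x, se_y):
        # Variance of y[k] - beta * x[k] under the model.
        resid_var = sy ** 2 - 2.0 * beta * rho * sx * sy + (beta * sx) ** 2
        total += (yk - beta * xk) ** 2 / resid_var
    return total


def likelihood_estimate(x, y, se_x, se_y, rho=0.0):
    """Maximum likelihood estimate of the causal effect and its SE."""
    start, start_se = ivw_estimate(x, y, se_y)
    # Assumed search bracket: five IVW standard errors either side.
    lo, hi = start - 5.0 * start_se, start + 5.0 * start_se

    def f(b):
        return profile_deviance(b, x, y, se_x, se_y, rho)

    # Golden-section search for the minimum of the profiled deviance.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > 1e-10:
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    beta_hat = 0.5 * (lo + hi)

    # Standard error from the numerical curvature of the deviance.
    h = 0.01 * start_se
    curvature = (f(beta_hat + h) - 2.0 * f(beta_hat) + f(beta_hat - h)) / h ** 2
    return beta_hat, math.sqrt(2.0 / curvature)


# Illustrative numbers only (not data from the paper).
x = [0.5, 0.4, 0.3]
y = [0.10, 0.08, 0.06]
se_x = [0.05, 0.05, 0.05]
se_y = [0.05, 0.05, 0.05]
b_ivw, s_ivw = ivw_estimate(x, y, se_y)
b_lik, s_lik = likelihood_estimate(x, y, se_x, se_y, rho=0.0)
```

In this toy input the likelihood-based standard error comes out slightly larger than the IVW one, because the likelihood accounts for uncertainty in the associations with the risk factor while the IVW formula does not.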
Independence of Information on Causal Effect From Multiple Variants
By combining the estimates of association from multiple variants into a single estimate of the causal effect, an assumption is made that the variants provide independent information. There are several reasons why this may not be the case. First, the causal estimates are derived from the same data, and so will not be entirely independent. However, correlation between the estimates should be low unless the sample size is particularly small. Secondly, there may be statistical interactions between variants in their associations with the risk factor (gene–gene interactions). Thirdly, the distribution of genetic variants may be correlated (linkage disequilibrium).
We perform simulation studies to assess the impact of the assumption that multiple genetic variants provide independent information on the causal estimate, in particular in the presence of gene–gene interactions and linkage disequilibrium. We compare the causal effect estimate (and its precision) obtained from summarized data to that obtained if individual‐level data on the variants, risk factor and outcome were available on the whole study population.
Simulation Study With Independently Distributed Variants
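[Equation (4), the data‐generating model for the risk factor X and the outcome Y, is missing from the source.]
The causal effect of X on Y was set to +0.2, and U represents negative (unmeasured) confounding between X and Y. The three IVs take values 0, 1, 2 representing the number of minor alleles in three independently distributed SNPs, each with a minor allele frequency of [value missing from the source]. Nine sets of values were taken for the gene–gene interactions: $\alpha_{12} = \alpha_{13} = \alpha_{23} = 0$ (no gene–gene interactions), and $\alpha_{12} = \pm 0.08$, $\alpha_{13} = \pm 0.1$, $\alpha_{23} = \pm 0.12$ in each of the eight sign combinations (gene–gene interactions present).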
Simulation Study With Correlated Variants
To simulate correlated variants, two draws were made from a zero‐mean K‐dimensional multivariate normal distribution for each individual i. For each draw, if the kth component was positive, a variant allele was recorded for the kth variant. The draws represent the two haplotypes for each individual [Lunn et al., 2006]. In this way, the variance‐covariance matrix in the multivariate normal distribution determines the correlation between the variants, which take values 0, 1, 2. Data for three genetic variants were simulated using the following model:
[Equation (5) is missing from the source.]
Five values of the correlation parameter in the variance‐covariance matrix were considered, corresponding to a mean squared correlation between variants ($r^2$) of 0 (no linkage disequilibrium), 0.06, 0.13, 0.26 and 0.41.
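The haplotype-thresholding scheme described above can be sketched as follows. This is a hedged illustration rather than the authors' simulation code: the equal pairwise correlation `corr` between the normal components (built from one shared factor) is an assumption of the sketch, since the variance-covariance matrix actually used is not available here, and the function names are hypothetical.

```python
import math
import random


def simulate_genotypes(n, k=3, corr=0.5, seed=1):
    """Genotypes (0, 1, 2) for k variants in linkage disequilibrium.

    Each of the two haplotypes is a draw from a zero-mean k-dimensional
    normal distribution with equal pairwise correlation `corr` (0 <= corr < 1);
    a variant allele is recorded wherever a component is positive.
    """
    rng = random.Random(seed)
    a, b = math.sqrt(corr), math.sqrt(1.0 - corr)
    genotypes = []
    for _ in range(n):
        g = [0] * k
        for _hap in range(2):
            shared = rng.gauss(0.0, 1.0)  # factor shared by all k components
            for j in range(k):
                z = a * shared + b * rng.gauss(0.0, 1.0)
                if z > 0:
                    g[j] += 1
        genotypes.append(g)
    return genotypes


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / math.sqrt(suu * svv)


def mean_squared_correlation(genotypes):
    """Average squared correlation (r^2) over all pairs of variants."""
    k = len(genotypes[0])
    cols = [[row[j] for row in genotypes] for j in range(k)]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    return sum(pearson(cols[i], cols[j]) ** 2 for i, j in pairs) / len(pairs)


linked = simulate_genotypes(2000, corr=0.8)
unlinked = simulate_genotypes(2000, corr=0.0)
r2_linked = mean_squared_correlation(linked)
r2_unlinked = mean_squared_correlation(unlinked)
```

Thresholding at zero fixes the allele frequency at 0.5 for every variant; other frequencies would need a different threshold.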
Weak Instruments
A weak instrument is a variable that satisfies the IV assumptions, but does not explain a large proportion of variation in the risk factor, so that the statistical association between the risk factor and the IV in the dataset is ‘weak’ [Burgess et al., 2011b]. IV estimates using weak instruments are biased in the direction of the observational estimate, and the distribution of the IV estimate is poorly approximated by a normal distribution [Burgess and Thompson, 2011]. The magnitude of bias depends on the expected value of the F statistic in the regression of the risk factor on the IVs, with lower F statistics corresponding to greater bias. Bias with a single IV in moderately large datasets is typically negligible, but bias may be considerable when there are multiple IVs [Angrist and Pischke, 2009]. For this reason, we are especially interested in how the summarized data methods perform with weak instruments.
Implementation
Estimates of the causal effect using all the genetic variants are calculated using individual‐level data with the two‐stage least squares (2SLS) method [Baum et al., 2003], and using summarized data with the IVW (equations 1 and 2) and likelihood‐based (equation 3) methods. The first‐stage model in the 2SLS was taken as additive in the variants throughout, and as such the genetic model was misspecified when there were gene–gene interactions. Summarized associations were obtained by ordinary least squares (OLS) linear regression of the risk factor and outcome on each variant in separate regression models. The likelihood‐based analyses were performed in R (http://www.r-project.org) using the optim command to directly maximize the likelihood.
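As an illustration of how the summarized associations arise, the sketch below simulates individual-level data and regresses the risk factor and the outcome on each variant separately by OLS. All numerical values (sample size, minor allele frequency, per allele effect, confounder coefficients) are assumptions made for the example, not the values used in the paper, and the function names are hypothetical.

```python
import math
import random


def ols_slope(g, v):
    """Slope and its standard error from a simple linear regression of v on g."""
    n = len(g)
    mg, mv = sum(g) / n, sum(v) / n
    sgg = sum((a - mg) ** 2 for a in g)
    sgv = sum((a - mg) * (b - mv) for a, b in zip(g, v))
    slope = sgv / sgg
    intercept = mv - slope * mg
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(g, v))
    return slope, math.sqrt(rss / (n - 2) / sgg)


def simulate_summaries(n=5000, maf=0.3, alpha=0.5, beta=0.2, seed=1):
    """Per-variant summaries ((x_k, se_x), (y_k, se_y)) for three
    independent variants under an assumed additive model."""
    rng = random.Random(seed)
    genos = [[], [], []]
    xs, ys = [], []
    for _ in range(n):
        # Genotype = number of minor alleles on two independent haplotypes.
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(3)]
        u = rng.gauss(0.0, 1.0)                       # unmeasured confounder
        x = alpha * sum(g) + u + rng.gauss(0.0, 1.0)  # risk factor
        y = beta * x - u + rng.gauss(0.0, 1.0)        # negative confounding
        for k in range(3):
            genos[k].append(g[k])
        xs.append(x)
        ys.append(y)
    return [(ols_slope(gk, xs), ols_slope(gk, ys)) for gk in genos]


summaries = simulate_summaries()
```

Each element of `summaries` holds the two coefficient and standard error pairs that a summarized data method would take as input for one variant.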
A fixed value of the correlation ρ between the genetic associations with the risk factor and outcome ([value missing from the source]) was used in the likelihood‐based method, based on the approximate observational correlation between the risk factor and outcome. Estimates were not especially sensitive to moderate (±0.2) changes in this correlation. (A sensitivity analysis for this parameter is shown later for an applied example.)
In each scenario, results from 10,000 simulated datasets for the comparison of the individual‐level and summarized data methods are given. We present the mean and median estimates across simulations, the standard deviation (SD) of estimates, the mean standard error (SE), the coverage of the 95% confidence interval for the causal effect (the proportion of simulated datasets for which the 95% confidence interval included the true value of β = 0.2), and the empirical power at a 5% significance level (the proportion of simulated datasets for which the 95% confidence interval excluded the null value of β = 0). The Monte Carlo standard error (representing the variation in estimates due to the finite number of simulations) was approximately 0.001 for the mean estimate (0.004 for the final scenario with gene–gene interactions) and 0.2% for the coverage. In each set of simulations, the mean value of the F statistic in the regression of the risk factor on the IVs is given.
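The summary measures listed above can be computed from the per-dataset estimates and standard errors as in the following sketch. The function names are hypothetical, and the confidence intervals are assumed to be the usual normal-approximation intervals (estimate ± 1.96 SE).

```python
import math
import statistics


def summarize_simulations(estimates, std_errors, true_value, null_value=0.0, z=1.96):
    """Mean, median, SD, mean SE, coverage (%) and power (%) across simulations."""
    n = len(estimates)
    covered = sum(1 for b, s in zip(estimates, std_errors)
                  if abs(b - true_value) <= z * s)
    rejected = sum(1 for b, s in zip(estimates, std_errors)
                   if abs(b - null_value) > z * s)
    return {
        "mean": statistics.fmean(estimates),
        "median": statistics.median(estimates),
        "sd": statistics.stdev(estimates),
        "mean_se": statistics.fmean(std_errors),
        "coverage": 100.0 * covered / n,
        "power": 100.0 * rejected / n,
    }


def mc_se_of_mean(sd, n_sims):
    """Monte Carlo standard error of the mean estimate."""
    return sd / math.sqrt(n_sims)


def mc_se_of_coverage(coverage_pct, n_sims):
    """Monte Carlo standard error (in percentage points) of a coverage estimate."""
    p = coverage_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_sims)


# Toy input, for illustration only.
toy = summarize_simulations([0.15, 0.2, 0.25, 0.5], [0.1, 0.1, 0.1, 0.1],
                            true_value=0.2)
```

With 10,000 simulations and coverage near 95%, `mc_se_of_coverage` gives roughly 0.2 percentage points, and an SD of about 0.085 gives a Monte Carlo standard error of roughly 0.001 for the mean, in line with the figures quoted in the text.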
Results
Independently Distributed Variants
Results from the scenario with gene–gene interactions are given in Table 1. The individual‐level 2SLS and summarized IVW analyses gave similar mean and median estimates, which did not differ in the third decimal place. They showed slight bias in the direction of the observational estimate, consistent with that predicted by weak instrument bias. The likelihood‐based analyses showed less bias with mean estimates around or slightly above the true value of 0.2 and median estimates slightly below the true value. Departures from the true value were most marked in the final scenario, where the mean F statistic for the genetic variants is below the conventional threshold of 10, below which IVs are considered to be ‘weak’.
| α12 | α13 | α23 | Mean F | Method | Mean | Median | SD | Mean SE | Coverage | Power |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 47.3 | 2SLS | 0.196 | 0.192 | 0.085 | 0.085 | 94.8 | 65.2 |
| | | | | IVW | 0.196 | 0.192 | 0.085 | 0.078 | 92.6 | 70.1 |
| | | | | Likelihood | 0.200 | 0.197 | 0.087 | 0.082 | 94.2 | 69.1 |
| +0.08 | +0.1 | +0.12 | 126.9 | 2SLS | 0.199 | 0.197 | 0.052 | 0.051 | 95.0 | 98.3 |
| | | | | IVW | 0.199 | 0.197 | 0.052 | 0.047 | 92.6 | 98.6 |
| | | | | Likelihood | 0.200 | 0.199 | 0.052 | 0.050 | 94.0 | 98.6 |
| −0.08 | +0.1 | +0.12 | 88.1 | 2SLS | 0.198 | 0.197 | 0.061 | 0.062 | 95.1 | 78.1 |
| | | | | IVW | 0.198 | 0.197 | 0.061 | 0.057 | 93.0 | 81.6 |
| | | | | Likelihood | 0.201 | 0.199 | 0.062 | 0.060 | 94.2 | 80.9 |
| +0.08 | −0.1 | +0.12 | 74.4 | 2SLS | 0.198 | 0.196 | 0.068 | 0.067 | 95.0 | 86.6 |
| | | | | IVW | 0.198 | 0.196 | 0.068 | 0.062 | 92.9 | 88.9 |
| | | | | Likelihood | 0.201 | 0.199 | 0.068 | 0.065 | 94.2 | 88.6 |
| +0.08 | +0.1 | −0.12 | 59.6 | 2SLS | 0.197 | 0.194 | 0.075 | 0.075 | 94.8 | 92.4 |
| | | | | IVW | 0.197 | 0.194 | 0.075 | 0.069 | 92.8 | 93.6 |
| | | | | Likelihood | 0.201 | 0.198 | 0.076 | 0.073 | 94.2 | 93.4 |
| −0.08 | −0.1 | +0.12 | 45.8 | 2SLS | 0.197 | 0.193 | 0.085 | 0.086 | 95.3 | 28.9 |
| | | | | IVW | 0.197 | 0.193 | 0.085 | 0.079 | 93.3 | 37.9 |
| | | | | Likelihood | 0.202 | 0.197 | 0.087 | 0.084 | 94.6 | 35.9 |
| −0.08 | +0.1 | −0.12 | 33.1 | 2SLS | 0.196 | 0.191 | 0.102 | 0.102 | 94.9 | 46.7 |
| | | | | IVW | 0.196 | 0.191 | 0.102 | 0.093 | 92.7 | 53.8 |
| | | | | Likelihood | 0.203 | 0.197 | 0.105 | 0.100 | 94.4 | 52.2 |
| +0.08 | −0.1 | −0.12 | 23.0 | 2SLS | 0.190 | 0.183 | 0.123 | 0.124 | 94.7 | 64.3 |
| | | | | IVW | 0.190 | 0.183 | 0.123 | 0.113 | 92.8 | 69.7 |
| | | | | Likelihood | 0.201 | 0.192 | 0.129 | 0.121 | 94.2 | 68.7 |
| −0.08 | −0.1 | −0.12 | 6.6 | 2SLS | 0.172 | 0.148 | 0.249 | 0.244 | 93.4 | 2.8 |
| | | | | IVW | 0.172 | 0.148 | 0.249 | 0.217 | 92.1 | 10.9 |
| | | | | Likelihood | 0.221 | 0.180 | 0.357 | 0.285 | 94.6 | 5.7 |

Table 1. Instrumental variable estimates of causal effect +0.2 from simulated data with and without gene–gene interactions using individual‐level data (two‐stage least squares method, 2SLS) and summarized data (inverse‐variance weighted, IVW, and likelihood‐based methods) with mean F statistic, mean and median estimates across 10,000 simulations, SD of estimates, mean SE of estimates, coverage (%) of 95% confidence interval, and power (%) at a 5% significance level
The coverage was around 95% for the 2SLS and likelihood‐based methods, although it was slightly below the nominal level for the 2SLS method in the weak instrument scenario, and marginally below it (average of 94.3%) for the likelihood‐based method throughout. Coverage for the IVW method was consistently below the nominal level at around 93%, indicating that the method gave standard errors that were slightly too small (the mean SE was less than the SD of the estimates). Estimates from the likelihood‐based and 2SLS methods had similar efficiency, with the 2SLS analyses giving slightly less variable estimates (lower SD), but the likelihood‐based analyses giving slightly smaller standard errors (lower mean SE). The IVW method had the greatest empirical power, although this was offset by its coverage falling short of the nominal level. Power from the likelihood‐based method was marginally lower, and from the 2SLS method lower still.
Overall, despite gene–gene interactions leading to misspecification of the genetic model in the 2SLS method and effect modification in the genetic associations in the summarized data methods, the assumption of independence of the information provided by uncorrelated variants did not seem to give misleading results.
Correlated Variants
Results from the scenarios with variants in linkage disequilibrium are given in Table 2. Estimates from the individual‐level and summarized data methods were close to unbiased. The coverage of estimates from the 2SLS method was close to the nominal 95% level; however, the standard errors from the summarized data methods were too small and coverage was well below 95% when variants were correlated, even when the correlation was not large. Power was not reported in this case as it is misleading when the coverage is not close to the nominal levels. This shows that variants used in a summarized analysis must be uncorrelated in order to obtain valid statistical inferences.
| r2 | Mean F | Method | Mean | Median | SD | Mean SE | Coverage |
|---|---|---|---|---|---|---|---|
| 0.00 | 42.6 | 2SLS | 0.195 | 0.191 | 0.090 | 0.090 | 94.8 |
| | | IVW | 0.195 | 0.191 | 0.090 | 0.082 | 92.8 |
| | | Likelihood | 0.200 | 0.196 | 0.092 | 0.087 | 94.1 |
| 0.06 | 47.8 | 2SLS | 0.196 | 0.193 | 0.086 | 0.085 | 94.5 |
| | | IVW | 0.197 | 0.194 | 0.086 | 0.073 | 90.3 |
| | | Likelihood | 0.201 | 0.198 | 0.087 | 0.077 | 92.0 |
| 0.13 | 52.6 | 2SLS | 0.197 | 0.194 | 0.080 | 0.081 | 95.0 |
| | | IVW | 0.199 | 0.196 | 0.080 | 0.066 | 89.5 |
| | | Likelihood | 0.202 | 0.199 | 0.081 | 0.070 | 91.2 |
| 0.26 | 63.3 | 2SLS | 0.197 | 0.193 | 0.074 | 0.073 | 94.6 |
| | | IVW | 0.199 | 0.196 | 0.074 | 0.055 | 85.1 |
| | | Likelihood | 0.201 | 0.198 | 0.074 | 0.058 | 87.1 |
| 0.41 | 74.8 | 2SLS | 0.198 | 0.196 | 0.067 | 0.067 | 95.0 |
| | | IVW | 0.201 | 0.199 | 0.068 | 0.046 | 82.1 |
| | | Likelihood | 0.202 | 0.200 | 0.068 | 0.048 | 84.2 |

Table 2. Instrumental variable estimates of causal effect +0.2 from simulated data with correlated variants (correlation measured by r2, the average squared correlation between variants) using individual‐level data (two‐stage least squares method, 2SLS) and summarized data (inverse‐variance weighted, IVW, and likelihood‐based methods) with mean F statistic, mean and median estimates across simulations, SD of estimates, mean SE of estimates, coverage (%) of 95% confidence interval
Weak Instruments
We repeated the simulation in the absence of gene–gene interactions and linkage disequilibrium, but using 20 genetic variants with smaller effects on the risk factor to investigate the performance of the methods with weak instruments. Our simulations (see Supporting Information) suggest that the IVW and likelihood‐based methods have similar behavior with weak instruments to the 2SLS method. In particular, this means that when the expected value of the F statistic is greater than 10, the bias of the causal estimate is less than 10% of the bias of the confounded observational estimate from an OLS regression analysis [Staiger and Stock, 1997].
Example: Causal Effect of Low‐Density Lipoprotein Cholesterol on Coronary Artery Disease
Low‐density lipoprotein cholesterol (LDL‐C) is a known causal risk factor for coronary artery disease (CAD). Genetic variants associated with LDL‐C have been found in many different regions of the human genome. We take a published study reporting genetic associations of variants with LDL‐C, high‐density lipoprotein cholesterol (HDL‐C), and triglycerides (TG), and with CAD risk from a meta‐analysis of GWA studies [Waterworth et al., 2010]. We consider five genetic variants associated with LDL‐C ($P < 1 \times 10^{-5}$), but not associated with HDL‐C or TG ([P‐value threshold missing from the source]), to mitigate potential pleiotropy. These variants are on chromosome 1 (PCSK9 and SORT1 gene regions), chromosome 2 (APOB), chromosome 5 (HMGCR), and chromosome 19 (LDLR); details of the variants are given in the Supporting Information. The variants are not in linkage disequilibrium. A P‐value of $1 \times 10^{-5}$ corresponds to an F statistic of around 20, so weak instrument bias is negligible. The estimated associations with 95% confidence intervals are shown graphically in Figure 1, together with an estimate of the causal effect of LDL‐C on CAD risk using the likelihood‐based method for summarized data assuming no correlation between the estimates of genetic association with the risk factor and the outcome. Odds ratio estimates for a 30% reduction in LDL‐C levels using the IVW method and the likelihood‐based method for a range of different correlation values are displayed in Table 3.

| Method | Correlation (ρ) | Estimate | 95% confidence interval |
|---|---|---|---|
| IVW | – | 0.33 | 0.25, 0.45 |
| Likelihood‐based | 0 | 0.33 | 0.24, 0.46 |
| Likelihood‐based | −0.4 | 0.33 | 0.23, 0.48 |
| Likelihood‐based | −0.2 | 0.33 | 0.24, 0.47 |
| Likelihood‐based | −0.1 | 0.33 | 0.24, 0.46 |
| Likelihood‐based | 0.1 | 0.33 | 0.25, 0.45 |
| Likelihood‐based | 0.2 | 0.33 | 0.25, 0.45 |
| Likelihood‐based | 0.4 | 0.34 | 0.26, 0.44 |
Table 3. Instrumental variable estimates of causal effect of low‐density lipoprotein cholesterol (LDL‐C) on risk of coronary artery disease (CAD) using inverse‐variance weighted (IVW) method and likelihood‐based method for different values of the correlation parameter (ρ)
We see that the estimates from both summarized data methods are similar, and that changing the correlation parameter in the likelihood‐based method has little impact. The graph indicates that variants with a greater magnitude of association with LDL‐C also have a greater association with CAD risk. The overall estimate of causal association passes close to the estimate from each of the variants, giving plausibility to the instrumental variable assumptions, and suggesting that changes in LDL‐C from different biological mechanisms may have similar effects on CAD risk [Burgess et al., 2012]. The confidence interval from the IVW method was slightly narrower than that from the likelihood‐based method with ρ = 0, consistent with the slightly reduced coverage seen in the simulation studies.
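The percentage risk reduction quoted in the abstract can be recovered from the odds ratios in Table 3 with a one-line conversion. The sketch assumes the odds ratio is read as an approximate relative risk; the function name is hypothetical.

```python
def percent_risk_reduction(odds_ratio):
    """Percentage reduction in risk implied by an odds ratio below 1,
    reading the odds ratio as an approximate relative risk."""
    return 100.0 * (1.0 - odds_ratio)


# Likelihood-based estimate with rho = 0 from Table 3: 0.33 (0.24, 0.46).
point = round(percent_risk_reduction(0.33))
# The upper odds ratio limit gives the lower limit of the reduction, and vice versa.
lower = round(percent_risk_reduction(0.46))
upper = round(percent_risk_reduction(0.24))
```

The swap of interval limits is the only subtlety: a smaller odds ratio corresponds to a larger risk reduction.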
Discussion
In this paper, we have considered methods for Mendelian randomization using summarized data on multiple genetic variants. The target for estimation is a non‐genetic parameter, the causal effect of the risk factor on the outcome. Each variant provides additional information on this parameter, and so the most precise estimate can be obtained using all available variants that are valid instrumental variables [Burgess et al., 2011a]. GWA studies are promising resources for powerful Mendelian randomization investigations; however, obtaining individual‐level data on large numbers of participants is often problematic for reasons of logistics and confidentiality. Our methods for obtaining causal estimates from summarized data commonly presented in published reports facilitate causal assessment of risk factors in existing consortia without additional data collection or sharing. Simulation results suggest that causal estimates obtained from summarized data using a likelihood‐based model with independently distributed ‘non‐weak’ variants are almost as precise as those obtained from individual‐level data, with bias close to zero and coverage close to nominal levels. The empirical power of estimates from the likelihood‐based method was greater than that from the 2SLS method. An alternative approach, using an inverse‐variance weighted method, gives similar point estimates to an individual‐level data analysis and slightly improved power over the likelihood‐based method, but slightly too narrow confidence intervals.
Comparison of Summarized Data Methods
The IVW and likelihood‐based methods make different assumptions about the distribution of variables. The IVW method uses an asymptotic estimate of the standard error of the causal (ratio) estimate from each variant; this is known to underestimate the true variation in the estimate, especially when the IV is weak [Burgess and Thompson, 2012]. No allowance is made in the method for uncertainty in the genetic association with the risk factor, although this could be incorporated by including additional terms from the delta method. The likelihood‐based method assumes a bivariate normal distribution for the genetic associations with the risk factor and outcome. In both summarized data methods, the variances of the association estimates are assumed to be known; this may be why coverage is consistently slightly below the nominal level. The likelihood‐based method is more computationally complex, but allows for correlation between the genetic association estimates with the risk factor and outcome, which is ignored in the IVW method. The likelihood‐based method also has a natural extension to a meta‐analysis framework using a hierarchical model (see Extension to Multiple Studies), which may better account for heterogeneity between studies than using meta‐analysed genetic association estimates directly.
Weak Instruments
With weak instruments, estimates using both of the summarized data methods demonstrated bias similar to that of the 2SLS method. If the distributions of genetic variants are uncorrelated, then each variant explains independent variation in the risk factor. For variants of equal strength, the expected F statistic in the univariate linear regression of the risk factor on one of the variants is approximately the same as the expected F statistic in the multiple regression on all of the variants. An F statistic of 10 corresponds to a P‐value of around 0.001; an F statistic of 20 to a P‐value of around $1 \times 10^{-5}$; and an F statistic of 30 to a P‐value of around $5 \times 10^{-8}$, a threshold often used for assessing GWA significance. In the example presented, as the variants were independently distributed and each had a P‐value below $1 \times 10^{-5}$, the F statistic for all of the variants is at least 20; in fact, as some variants had P‐values well below $1 \times 10^{-5}$, the F statistic would be greater. With a sample size of 10,000, an F statistic of 10 corresponds to a coefficient of determination ($R^2$) for each variant of around 0.1%. When sample sizes and the strength of genetic variants are limited, it may be necessary to restrict the number of variants used in a Mendelian randomization analysis in order to mitigate bias from weak instruments.
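The correspondence between F statistics, P-values and the coefficient of determination can be checked numerically. The sketch below assumes a regression on a single variant in a large sample, so that the F statistic is treated as a chi-squared variable with one degree of freedom; the function names are hypothetical.

```python
import math


def p_value_from_f(f_stat):
    """Approximate two-sided P-value for a single-variant regression,
    treating F as chi-squared with 1 degree of freedom (large sample)."""
    return math.erfc(math.sqrt(f_stat / 2.0))


def r_squared_from_f(f_stat, n):
    """R^2 implied by an F statistic with one regressor and n observations,
    from F = (n - 2) * R^2 / (1 - R^2)."""
    return f_stat / (f_stat + n - 2)


p10 = p_value_from_f(10)    # of the order of 0.001
p20 = p_value_from_f(20)    # of the order of 1e-5
p30 = p_value_from_f(30)    # close to the 5e-8 genome-wide threshold
r2 = r_squared_from_f(10, 10000)  # about 0.1%
```

In small samples the exact F distribution gives slightly larger P-values than this approximation.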
Close to unbiased estimates with weak instruments can be obtained from individual‐level data using the limited information maximum likelihood (LIML) method [Angrist and Pischke, 2009] (see Supporting Information).
Assessment of the IV Assumptions
A limitation of the use of Mendelian randomization is that the instrumental variable assumptions cannot be assessed without supplementary data. Although the IV assumptions can never be fully tested empirically, they can be assessed to some extent, for example by testing the association of the genetic variants with measured covariates to assess potential pleiotropy. This can be undertaken using individual‐level data. It is also possible with summarized data, as with HDL‐C and TG in the example, although genetic associations with a full range of covariates would not necessarily be measured or routinely reported by a GWA study. These associations could be checked in the literature; however, the assumption is necessary that the literature‐based estimates are valid for the population under investigation. Other assessments, such as addressing population stratification by the evaluation of genetic principal components, or testing for the attenuation of genetic associations with the outcome on adjustment for the risk factor [Glymour et al., 2012], require individual‐level data.
In addition, many of the parametric assumptions required by IV methods for effect estimation, such as linearity of genetic associations or of the risk factor–outcome association [Didelez and Sheehan, 2007], cannot be assessed in summarized data.
Risk Factor and Outcome Associations in Separate Samples
Not all GWA studies measure data on a large number of phenotypic variables, and so genetic associations with the risk factor and the outcome may not be available in the same sample. Estimates of the association of the genetic variants with the risk factor and the outcome may be obtained from independent sources. This is known as a two‐sample IV analysis [Pierce and Burgess, 2013]. A key assumption in this case is that the genetic associations with the risk factor and outcome are of the same magnitude in both sources.
Correlation between genetic associations with the risk factor and outcome arising from estimation of the coefficients in the same source is ignored by the inverse‐variance weighted method, but can be acknowledged in the likelihood‐based method. If the sources for the estimates are neither identical nor disjoint, but instead overlap, then a meta‐analysis approach would be recommended to correctly account for the structure in the data. In the absence of study‐specific estimates, we advocate the likelihood‐based method with a sensitivity analysis for the correlation parameter.
Linkage Disequilibrium
If two genetic variants are in complete linkage disequilibrium, then inclusion of both variants in an individual‐level analysis model would not lead to additional information. However, if variants are in partial linkage disequilibrium, although the information provided by each variant is not independent, each variant does provide additional information on the causal effect. Summarized data methods using variants in linkage disequilibrium overstate precision.
Typically, GWA studies report the association of a single lead variant in a genetic region. If this variant is not the causal variant, or if there are multiple causal variants in the region, then additional information may be obtained by considering multiple variants per region. Information on such variants could be included correctly in a summarized analysis by considering conditional effects of variants on the risk factor and outcome by adjustment for the lead variant in a regression model [Yang et al., 2012]. We have not developed these methods as suitable data are not routinely reported in applied analyses.
Extension to Multiple Studies
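The likelihood‐based method can be extended to summarized data from multiple studies. In study m, genetic variant k is associated with an observed $X_{km}$ mean change in the risk factor with standard error $\sigma_{Xkm}$, with $Y_{km}$ and $\sigma_{Ykm}$ similarly defined for the outcome. Although the presentation of GWA results from multiple studies is common in published research, summarized genetic associations that have been meta‐analysed across studies are often reported rather than study‐specific associations. We present this model for investigators with access to study‐level summarized estimates in the hope that study‐specific results will be routinely reported in the future.
$$\begin{pmatrix} X_{km} \\ Y_{km} \end{pmatrix} \sim \mathcal{N}_2 \left( \begin{pmatrix} \xi_{km} \\ \beta_m \xi_{km} \end{pmatrix}, \begin{pmatrix} \sigma_{Xkm}^2 & \rho \sigma_{Xkm} \sigma_{Ykm} \\ \rho \sigma_{Xkm} \sigma_{Ykm} & \sigma_{Ykm}^2 \end{pmatrix} \right), \quad \xi_{km} \sim \mathcal{N}(\mu_{\xi k}, \tau_{\xi k}^2), \quad \beta_m \sim \mathcal{N}(\mu_\beta, \tau_\beta^2) \tag{6}$$
Random‐effects distributions are assumed for the genetic association parameters $\xi_{km}$ and the causal effect parameters $\beta_m$; fixed‐effect models ($\tau^2 = 0$) could also be assumed. The overall causal effect of X on Y is $\mu_\beta$. Such models can be estimated in a Bayesian framework with weakly informative priors on the heterogeneity parameters $\tau^2$. If a particular study does not provide an estimate $X_{km}$, the distribution of the parameter $\xi_{km}$ can be estimated using the relevant random‐effects distribution as an implicit prior.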
Winner's Curse
The winner's curse is the phenomenon that the association estimate of the variant with the strongest association from a GWA study tends to be overestimated [Göring et al., 2001]. Typically, GWA studies report the single variant from each gene region showing the strongest association with the trait of interest. If several variants have similar strength, the variant with the strongest observed association will not always truly have the strongest association with the risk factor. Data‐driven selection of the ‘lead’ variant results in bias in the causal estimate, as the association of the lead variant is typically over‐estimated [Burgess et al., 2011b]. This is an example of selection bias. This bias can be eliminated by choosing the lead variant for each genetic region from an independent data source.
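A small simulation illustrates both the bias and the remedy described above. It is a sketch under assumed values: five variants in a region with identical true associations with the risk factor, normally distributed estimates with a common standard error, and an independent replication sample of the same precision. The function name and all numbers are hypothetical.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.1, se=0.02, n_variants=5,
                           n_reps=2000, seed=1):
    """Mean association estimate of the 'lead' variant when it is selected
    and estimated in the same data, and when it is selected in one sample
    but estimated in an independent one."""
    rng = random.Random(seed)
    same_sample, independent_sample = [], []
    for _ in range(n_reps):
        discovery = [rng.gauss(true_effect, se) for _ in range(n_variants)]
        replication = [rng.gauss(true_effect, se) for _ in range(n_variants)]
        lead = max(range(n_variants), key=lambda j: discovery[j])
        same_sample.append(discovery[lead])          # data-driven selection
        independent_sample.append(replication[lead])  # selection from other data
    return statistics.fmean(same_sample), statistics.fmean(independent_sample)


mean_same, mean_independent = simulate_winners_curse()
```

The first mean exceeds the true association of 0.1 because the maximum of several noisy estimates is biased upwards, whereas the second stays close to 0.1 because selection and estimation use independent data.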
Sample Code
Sample R and WinBUGS code for estimating causal effects from summarized data on multiple variants associated with a risk factor, in a single study and in multiple studies, is provided in the Supporting Information.
Conclusion
If individual‐level data are available, these should be used directly to perform a Mendelian randomization analysis. However, if individual‐level data are not available, then valid statistical inference can still be obtained from summarized data on the associations of genetic variants with the risk factor and the outcome. On the basis of the simulations in this paper and the theoretical explanations for the differences in results, we recommend the likelihood‐based model for applied analysis of summarized data. However, analyses should be restricted to uncorrelated genetic variants (no linkage disequilibrium), and care should be taken when including large numbers of variants to assess potential weak instrument bias by examining the F statistic in the regression of the risk factor on the variants.
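As a minimal sketch of the weak instrument check mentioned above, the F statistic for the regression of the risk factor on the variants can be computed from the proportion of variance explained using the standard formula F = (R² / (1 − R²)) × (N − K − 1) / K, for N participants and K variants. The function name and the input values are hypothetical. Values of F below about 10 are conventionally treated as a sign of weak instruments.

```python
def f_statistic(r_squared, n_samples, n_variants):
    """F statistic for a linear regression of the risk factor on K variants,
    computed from the proportion of variance explained (R^2)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    if n_samples <= n_variants + 1:
        raise ValueError("n_samples must exceed n_variants + 1")
    return (r_squared / (1.0 - r_squared)) * (n_samples - n_variants - 1) / n_variants


# Hypothetical example: the same variance explained, spread over few or many variants.
f_few = f_statistic(r_squared=0.02, n_samples=10000, n_variants=10)
f_many = f_statistic(r_squared=0.02, n_samples=10000, n_variants=100)
print(f"F with 10 variants:  {f_few:.1f}")
print(f"F with 100 variants: {f_many:.1f}")
```

When additional variants explain little extra variation in the risk factor, the F statistic falls as variants are added, which is why analyses with many variants need this check.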
- Laurence J Howe, Frank Dudbridge, Amand F Schmidt, Chris Finan, Spiros Denaxas, Folkert W Asselbergs, Aroon D Hingorani, Riyaz S Patel, Polygenic risk scores for coronary artery disease and subsequent event risk amongst established cases, Human Molecular Genetics, 10.1093/hmg/ddaa052, (2020).
- Esther Molina-Montes, Claudia Coscia, Paulina Gómez-Rubio, Alba Fernández, Rianne Boenink, Marta Rava, Mirari Márquez, Xavier Molero, Matthias Löhr, Linda Sharp, Christoph W Michalski, Antoni Farré, José Perea, Michael O’Rorke, William Greenhalf, Mar Iglesias, Adonina Tardón, Thomas M Gress, Victor M Barberá, Tatjana Crnogorac-Jurcevic, Luis Muñoz-Bellvís, J Enrique Dominguez-Muñoz, Harald Renz, Joaquim Balcells, Eithne Costello, Lucas Ilzarbe, Jörg Kleeff, Bo Kong, Josefina Mora, Damian O’Driscoll, Ignasi Poves, Aldo Scarpa, Jingru Yu, Manuel Hidalgo, Rita T Lawlor, Weimin Ye, Alfredo Carrato, Francisco X Real, Núria Malats, Deciphering the complex interplay between pancreatic cancer, diabetes mellitus subtypes and obesity/BMI through causal inference and mediation analyses, Gut, 10.1136/gutjnl-2019-319990, (gutjnl-2019-319990), (2020).
- Jie V. Zhao, C. Mary Schooling, Sex-specific associations of insulin resistance with chronic kidney disease and kidney function: a bi-directional Mendelian randomisation study, Diabetologia, 10.1007/s00125-020-05163-y, (2020).
- Peter Kraft, Hongjie Chen, Sara Lindström, The Use of Genetic Correlation and Mendelian Randomization Studies to Increase Our Understanding of Relationships between Complex Traits, Current Epidemiology Reports, 10.1007/s40471-020-00233-6, (2020).
- Jean Morrison, Nicholas Knoblauch, Joseph H. Marcus, Matthew Stephens, Xin He, Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics, Nature Genetics, 10.1038/s41588-020-0631-4, (2020).
- Kai Xiang Lim, Frühling Rijsdijk, Saskia P. Hagenaars, Adam Socrates, Shing Wan Choi, Jonathan R. I. Coleman, Kylie P. Glanville, Cathryn M. Lewis, Jean-Baptiste Pingault, Studying individual risk factors for self-harm in the UK Biobank: A polygenic scoring and Mendelian randomisation study, PLOS Medicine, 10.1371/journal.pmed.1003137, 17, 6, (e1003137), (2020).
- Marijne Vandebergh, An Goris, Smoking and multiple sclerosis risk: a Mendelian randomization study, Journal of Neurology, 10.1007/s00415-020-09980-4, (2020).
- Rebecca B. Lawn, Hannah M. Sallis, Robyn E. Wootton, Amy E. Taylor, Perline Demange, Abigail Fraser, Ian S. Penton-Voak, Marcus R. Munafò, The effects of age at menarche and first sexual intercourse on reproductive and behavioural outcomes: A Mendelian randomization study, PLOS ONE, 10.1371/journal.pone.0234488, 15, 6, (e0234488), (2020).
- Jianhua Chen, Ruirui Chen, Siying Xiang, Ningning Li, Chengwen Gao, Chuanhong Wu, Qian Zhang, Yalin Zhao, Yanhui Liao, Robert Stewart, Yifeng Xu, Yongyong Shi, Zhiqiang Li, Cigarette smoking and schizophrenia: Mendelian randomisation study, The British Journal of Psychiatry, 10.1192/bjp.2020.116, (1-6), (2020).