Systematically missing confounders in individual participant data meta-analysis of observational cohort studies

One difficulty in performing meta-analyses of observational cohort studies is that the availability of confounders may vary between cohorts, so that some cohorts provide fully adjusted analyses while others only provide partially adjusted analyses. Commonly, analyses of the association between an exposure and disease either are restricted to cohorts with full confounder information, or use all cohorts but do not fully adjust for confounding. We propose using a bivariate random-effects meta-analysis model to use information from all available cohorts while still adjusting for all the potential confounders. Our method uses both the fully adjusted and the partially adjusted estimated effects in the cohorts with full confounder information, together with an estimate of their within-cohort correlation. The method is applied to estimate the association between fibrinogen level and coronary heart disease incidence using data from 154 012 participants in 31 cohorts. Copyright © 2009 John Wiley & Sons, Ltd.


INTRODUCTION
Results from observational studies, such as epidemiological cohort studies, are susceptible to the distorting influence of confounders. These variables, through their association with the outcome of interest, can result in misleading inferences for the effect of other covariates unless they are properly adjusted for. Although including potential but unimportant confounders results in a loss of precision, inappropriately excluding them can lead to erroneous conclusions, and hence it is [...] of IPD meta-analyses over their aggregate data counterparts, such as ensuring that all studies have the same variable definitions and the same analysis models, and enabling subgroup analyses [11].
The paper is set out as follows. In Section 2 we describe our motivating example, an IPD meta-analysis of 31 cohort studies relating plasma fibrinogen levels to time to coronary heart disease events [15]. Here only 14 cohorts provide information on all confounders, so 17 cohorts cannot provide fully adjusted estimates. In Section 3 our proposed model is described. In Section 4 procedures for estimating the within-study correlations are derived and in Section 5 the numerical implementation is discussed. Some illustrative analyses are performed in Section 6 and in Section 7 we return to the original data analysis and perform analyses more directly applicable to this. Section 8 summarizes our conclusions.

THE FIBRINOGEN DATA
We re-examine the database of our large collaborative IPD meta-analysis which explored the association between plasma fibrinogen and coronary heart disease in 31 cohort studies with 154 012 participants [15]. This was assessed using a proportional hazards (Cox) model, stratified by cohort, sex and (for the two cohorts that were randomized controlled trials) trial arm.
All 31 cohorts record whether or not coronary heart disease events occurred, and the times to event or censoring. They also provide details of every participant's fibrinogen level, age, smoking status, total cholesterol, systolic blood pressure and body mass index. Particular interest lies in the effect of participants' fibrinogen levels on coronary heart disease-free survival times, and the other covariates included in the models below represent potential confounders.
Only 14 cohorts give near complete data on participants' HDL cholesterol, LDL cholesterol, alcohol consumption, triglycerides and history of diabetes. A summary of the completeness of these additional covariates is provided in Table I. This table shows that cohorts generally have relatively few missing observations on variables that their designs intended to collect; the overwhelming majority of the missing observations are therefore systematically missing. There is however a single cohort that attempts to provide details of all five of the additional confounders but has much lower response rates for the cholesterol variables and triglycerides. In order to ensure consistency with our previous analysis [15], this particular cohort is treated as not providing details of these. A further issue is that total, HDL and LDL cholesterol levels are likely to be fairly collinear so only HDL and LDL cholesterol covariates, and not total cholesterol, were previously included in the full model [15]. Hence the covariates used in the partial models, using just the first set of covariates described above, were not quite a subset of those used in the full model.
We [15] previously performed two series of analyses: the first using information from all 31 cohorts, but adjusting only for covariates in the first set, and the second adjusting for covariates in both sets, but using just the information from the 14 cohorts that adequately record the necessary details. The intention here is to produce an analysis that takes into account all of the various potential confounders but also uses information from all 31 cohorts.

A BIVARIATE MODEL FOR MISSING CONFOUNDERS
In this section, a model is developed for the scenario where all cohorts provide the same subset of confounders and only some cohorts provide all of the confounders. It is also assumed that the

Modelling individual participants within cohorts
In a particular cohort, let $X_S$ denote the vector of an individual participant's stratifying covariates (sex and trial arm in our data), let $X_1$ denote the column vector of other covariates that are also observed by all cohorts (including the covariate of particular interest) and let $X_2$ denote the column vector of covariates that are only observed by some cohorts. For each cohort where $X_2$ is observed we assume the full proportional hazards model for the time to event,

$$h(t \mid X_S, X_1, X_2) = h^f_0(t; X_S)\exp(\boldsymbol{\beta}^f_1 X_1 + \boldsymbol{\beta}_2 X_2), \qquad (1)$$

and we similarly assume for the partial model, i.e. without the covariates $X_2$, that

$$h(t \mid X_S, X_1) = h^p_0(t; X_S)\exp(\boldsymbol{\beta}^p_1 X_1), \qquad (2)$$

where $h$ denotes the hazard function. This notation emphasizes the difference in the parameters $\boldsymbol{\beta}_1$ and the baseline hazard functions in the two models: in the full model these are denoted with a superscript $f$ indicating quantities that are fully adjusted (i.e. take into account all the covariates $X_1$ and $X_2$) while in the partial model the superscript $p$ denotes quantities that are only partially adjusted, as they do not take into account the covariates $X_2$. Both models apply to a particular cohort, and it is anticipated that cohorts will have different baseline hazards. Although both the full and partial models cannot simultaneously be true, unless $\boldsymbol{\beta}_2 = 0$, both are likely to provide adequate descriptions of the data [16]. Note that bold font is used for row vectors of parameters in these models to distinguish between these and their first entries in the notation that follows.
We can obtain estimates $\hat{\boldsymbol{\beta}}^f_1$, $\hat{\boldsymbol{\beta}}_2$ (from the full model (1)) and $\hat{\boldsymbol{\beta}}^p_1$ (from the partial model (2)) for each cohort that provides details of $X_2$, by maximizing the partial likelihood in the usual way [17]. For those cohorts that do not provide details of $X_2$, we can only obtain the corresponding estimate from model (2).
Let the first entry in $X_1$ denote the covariate of particular interest. We are therefore interested only in inference regarding the first parameter in the vectors $\boldsymbol{\beta}^f_1$ and $\boldsymbol{\beta}^p_1$, written $\beta^f_1$ and $\beta^p_1$ in non-bold font.

Between-cohorts model
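We assume for any given cohort that

$$\begin{pmatrix}\hat{\beta}^p_1 \\ \hat{\beta}^f_1\end{pmatrix} \Bigm| \begin{pmatrix}\beta^p_1 \\ \beta^f_1\end{pmatrix} \sim N\left(\begin{pmatrix}\beta^p_1 \\ \beta^f_1\end{pmatrix}, \begin{pmatrix}\sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2\end{pmatrix}\right), \qquad (3)$$

where we assume that $\sigma_1^2$, $\sigma_2^2$ and $\rho$ are fixed and known, a conventional assumption when using bivariate models in meta-analysis [5,6] and a generalization of assuming that the within-cohort variances are fixed and known in more usual univariate analyses [18,19]. In practice, however, these values must be estimated by standard methods: the variances are provided by the output of proportional hazards regression in standard statistical packages, and obtained from the observed information matrix [20, p. 41]. The difficulty lies in estimating $\rho$, and some approaches for obtaining this are suggested in Section 4.
The underlying $\beta^f_1$ and $\beta^p_1$ may vary from cohort to cohort. We assume that this variation can be modelled as

$$\begin{pmatrix}\beta^p_1 \\ \beta^f_1\end{pmatrix} \sim N\left(\begin{pmatrix}\beta^p \\ \beta^f\end{pmatrix}, \begin{pmatrix}\tau_1^2 & \kappa\tau_1\tau_2 \\ \kappa\tau_1\tau_2 & \tau_2^2\end{pmatrix}\right), \qquad (4)$$

so that, combining (3) and (4), marginally

$$\begin{pmatrix}\hat{\beta}^p_1 \\ \hat{\beta}^f_1\end{pmatrix} \sim N\left(\begin{pmatrix}\beta^p \\ \beta^f\end{pmatrix}, \begin{pmatrix}\sigma_1^2 + \tau_1^2 & \rho\sigma_1\sigma_2 + \kappa\tau_1\tau_2 \\ \rho\sigma_1\sigma_2 + \kappa\tau_1\tau_2 & \sigma_2^2 + \tau_2^2\end{pmatrix}\right), \qquad (5)$$

where we anticipate, but do not require, that both $\rho$ and $\kappa$ will be positive. Equation (5) is simply the standard bivariate model for meta-analysis, where the two outcomes are the partially and fully adjusted effects. This is an innovative use of this standard model, as more usually the outcomes are not defined so similarly (typically they are notably different types of patient responses). Despite this, the central limit theorem implies that model (3) provides a good approximation for large cohorts such as these and, combining this with model (4), which describes the between-cohort variation, the bivariate random-effects model is a natural choice for data such as these.
In a standard univariate meta-analysis of the fully adjusted effect of fibrinogen level, using just the cohorts that provide the necessary information on all confounders, the marginal model for $\hat{\beta}^f_1$ from (5) is assumed and information from 17 cohorts is simply discarded. This comment applies to all the analyses performed using the bivariate model below.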

The log-likelihood function of the fibrinogen data
For cohorts that do not provide $X_2$, only the partial model (2) is fitted, $\hat{\beta}^f_1$ is unobserved and, assuming that this is missing at random (MAR), the marginal distribution of $\hat{\beta}^p_1$ from model (5) alone is required.
The resulting log-likelihood function of the data, i.e. the fully and partially adjusted estimates from the 14 cohorts that give full confounder information, and the partially adjusted estimates from the remaining 17 cohorts, obtained as described in Section 3.1, is

$$\ell = \sum_{\text{cohorts with } X_2} \log f(\hat{\beta}^p_1, \hat{\beta}^f_1) + \sum_{\text{cohorts without } X_2} \log f(\hat{\beta}^p_1), \qquad (6)$$

where the bivariate and marginal densities, $f(\cdot)$, are obtained directly from distributions (5), the first and second summations in (6) being over the cohorts that provide $X_2$, and those that do not, respectively. This likelihood involves five parameters ($\beta^p$, $\beta^f$, $\tau_1^2$, $\tau_2^2$ and $\kappa$), but $\beta^f$ is of primary interest. Although the cohorts that fail to report $X_2$ do not provide direct evidence relating to $\beta^f$, they provide indirect information via their partially adjusted estimates and their assumed association with the fully adjusted estimates. The bivariate random-effects model therefore allows inferences concerning the fully adjusted effect to borrow strength from cohorts where fully adjusted estimates are unavailable, as explained in the introduction. The bivariate model also enables us to examine the nature of the relationship between the two types of effects.
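As an illustration of how the log-likelihood (6) is assembled from the bivariate and marginal normal densities, the following Python sketch evaluates it for a set of cohort-level estimates. It is a minimal sketch and not the code used for the analyses reported here: the function name `log_likelihood`, the argument layout and the use of `nan` to flag cohorts without a fully adjusted estimate are assumptions, as is the notation (`rho` for the within-cohort correlation, `tau1`, `tau2` and `kappa` for the between-cohort standard deviations and correlation).

```python
import numpy as np

def log_likelihood(params, est_p, est_f, s1, s2, rho):
    """Log-likelihood of the bivariate random-effects model (hypothetical helper).

    params : (beta_p, beta_f, tau1, tau2, kappa)
    est_p  : partially adjusted estimates, one per cohort
    est_f  : fully adjusted estimates, np.nan where unavailable
    s1, s2 : within-cohort standard errors of est_p and est_f
    rho    : within-cohort correlations (ignored where est_f is nan)
    """
    beta_p, beta_f, tau1, tau2, kappa = params
    total = 0.0
    for yp, yf, a, b, r in zip(est_p, est_f, s1, s2, rho):
        if np.isnan(yf):
            # Cohort without X2: marginal normal density of the partial estimate.
            var = a**2 + tau1**2
            total += -0.5 * (np.log(2 * np.pi * var) + (yp - beta_p) ** 2 / var)
        else:
            # Cohort with X2: bivariate normal density with within- plus
            # between-cohort covariance.
            off = r * a * b + kappa * tau1 * tau2
            cov = np.array([[a**2 + tau1**2, off],
                            [off, b**2 + tau2**2]])
            resid = np.array([yp - beta_p, yf - beta_f])
            _, logdet = np.linalg.slogdet(cov)
            quad = resid @ np.linalg.solve(cov, resid)
            total += -0.5 * (2 * np.log(2 * np.pi) + logdet + quad)
    return total
```

Passing this function (with its sign reversed) to a general-purpose optimizer would correspond to the maximum likelihood fit; fixing `beta_f` and maximizing over the other four parameters would give a profile log-likelihood.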
Missing estimates are not imputed by this procedure, but the relationship between the fully and partially adjusted estimates, for the cohorts where both estimates are available, is assumed to apply to those where only partially adjusted estimates can be obtained. Since a bivariate normal model is adopted this association is assumed to be linear, so that the method bears some similarities to the approach of Riley et al. [21], who impute missing estimates and standard errors from linear trends in the context of a sensitivity analysis.
Inferences for the partially adjusted $\beta^p$ are also made when fitting the bivariate model, which makes use of the fully adjusted estimates, although this borrowing of strength is likely to be very limited, as all 31 partially adjusted estimates are available. Once the fully and partially adjusted estimates have been obtained, the methodology therefore becomes a fairly standard application of the bivariate random-effects model for meta-analysis, but with one very particular difficulty: the within-cohort correlations are assumed known but need to be estimated. Some novel approaches are therefore developed for this purpose in the next section.

ESTIMATING THE WITHIN-COHORT CORRELATION
Although values of $\rho$ are estimated for each cohort, once evaluated these are regarded as fixed and known. We therefore suppress the emphasis that $\rho$ is an estimate in the notation that follows.

A nonparametric bootstrap estimate, $\rho_b$
Nonparametric bootstrapping [22] is probably the simplest, but slowest, procedure for obtaining an estimate of $\rho$. For each cohort that provides details of $X_2$, participants can be sampled with replacement, providing a bootstrap sample, where for each sampled individual we record all their details: their time to event, all covariates and whether or not they were censored. Fitting the full and partial models to each bootstrap sample gives a pair of estimates, and the sample correlation of these pairs across replications provides $\rho_b$.
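The resampling scheme can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code: ordinary least squares stands in for the proportional hazards fits so that the example needs only NumPy, and the function names `ols_first_coef` and `bootstrap_correlation` are hypothetical. In the setting described here, each replication would instead fit the full and partial proportional hazards models to the resampled participants.

```python
import numpy as np

def ols_first_coef(y, X):
    """Coefficient of the first covariate from least squares of y on X (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

def bootstrap_correlation(y, x1, x2, n_boot=500, seed=0):
    """Bootstrap correlation between partially and fully adjusted estimates.

    y  : outcome, shape (n,)
    x1 : covariate of interest, shape (n,)
    x2 : additional confounders, shape (n, q)
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    pairs = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample participants with replacement
        yb, x1b, x2b = y[idx], x1[idx], x2[idx]
        pairs[b, 0] = ols_first_coef(yb, x1b)                          # partial model
        pairs[b, 1] = ols_first_coef(yb, np.column_stack([x1b, x2b]))  # full model
    # Sample correlation of the paired estimates across replications.
    return np.corrcoef(pairs[:, 0], pairs[:, 1])[0, 1]
```

Because whole participants are resampled, the same rows feed both fits within a replication, which is what induces the correlation being estimated.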

An analytical estimate, $\rho_a$
An approximate analytical estimate of $\rho$ is also possible. This procedure is akin to the approach suggested by Steyerberg et al. [7,8], as an algebraic connection between the fully and partially adjusted estimates is utilized. We first consider the linear regression case, in which an exact expression is possible, and then extend this to the proportional hazards model. Consider the analogous linear regressions

$$E[Y \mid X_1, X_2] = \alpha^f + \boldsymbol{\beta}^f_1 X_1 + \boldsymbol{\beta}_2 X_2,$$
$$E[Y \mid X_1] = \alpha^p + \boldsymbol{\beta}^p_1 X_1,$$
$$E[X_2 \mid X_1] = \boldsymbol{\alpha}_c + c X_1,$$

where the various $\alpha$ parameters represent model intercepts; note that the regression of $X_2$ on $X_1$ is a multiple multivariate regression and therefore that $\boldsymbol{\alpha}_c$ and $c$ denote a vector and a matrix, respectively. Evaluating $E[Y \mid X_1] = E[E[Y \mid X_1, X_2] \mid X_1]$, and equating terms in $X_1$, provides $\boldsymbol{\beta}^p_1 = \boldsymbol{\beta}^f_1 + \boldsymbol{\beta}_2 c$. The corresponding identity also applies to maximum likelihood estimates, a result that may be proved by examining the normal equations resulting from the various linear regressions. In particular, using $\hat{c}_1$ to denote the first column of $\hat{c}$, $\hat{\beta}^p_1 = \hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1$. Since $\hat{\beta}^f_1$ and $\hat{\boldsymbol{\beta}}_2$ are estimated coefficients from the regression of $Y$ on $X_1$ and $X_2$, their properties depend on the distribution of $Y$ conditional on both $X_1$ and $X_2$; similarly $\hat{c}_1$ is a vector of estimated coefficients whose properties depend on the distribution of $X_2$ conditional on $X_1$.
The random variables $\hat{\beta}^f_1$ and $\hat{\boldsymbol{\beta}}_2$ are therefore functions of the random variable $(Y \mid X_1, X_2)$ and $\hat{c}_1$ is a function of $(X_2 \mid X_1)$. By definition, $(Y \mid X_1, X_2)$ and $(X_2 \mid X_1)$ are independent and hence $\hat{c}_1$ is independent of $\hat{\beta}^f_1$ and $\hat{\boldsymbol{\beta}}_2$. Thus using the approximation $E[\hat{c}_1] \approx \hat{c}_1$, the covariance of the fully and partially adjusted estimates can also be obtained approximately, and is evaluated as

$$\mathrm{Cov}(\hat{\beta}^p_1, \hat{\beta}^f_1) \approx \mathrm{Var}(\hat{\beta}^f_1) + \mathrm{Cov}(\hat{\beta}^f_1, \hat{\boldsymbol{\beta}}_2)\,\hat{c}_1. \qquad (7)$$

Since all the entries in the $\hat{\beta}$ vectors on the right-hand side of (7) are from the full model, i.e. the model including $X_2$ as a covariate, their variances and covariances may be obtained when fitting this model using any standard method.
The above applies to a linear regression, but in our application we assume the proportional hazards model for survival time $T$ where, in addition to $X_1$ and $X_2$, we have stratifying variables $X_S$. We assume a linear regression for [...] denotes the baseline survivor function for the full model, stratified by $X_S$ as before. Interpreting $P(T > t)$ as the expectation of the event $T > t$, we now use the iterated expectation formula and [...] where other terms have been absorbed into the baseline hazard function.
We therefore suggest that $\hat{\beta}^p_1 \approx \hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1$ be used as an approximation and hence that (7) be used to obtain the within-cohort covariance as for linear regression. An estimate of $\mathrm{Cov}(\hat{\beta}^p_1, \hat{\beta}^f_1)$ can therefore be obtained and the within-cohort correlation $\rho_a$ can be obtained as

$$\rho_a = \frac{\mathrm{Cov}(\hat{\beta}^p_1, \hat{\beta}^f_1)}{\sqrt{\mathrm{Var}(\hat{\beta}^p_1)\,\mathrm{Var}(\hat{\beta}^f_1)}}.$$
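The linear regression case can be checked numerically. The sketch below, which assumes a single covariate of interest and uses the hypothetical function name `analytic_correlation`, fits the full, partial and confounder regressions by least squares, confirms that the partially adjusted estimate equals the fully adjusted estimate plus the product of the confounder coefficients and the slopes of the confounders on the covariate of interest, and then forms the analytic correlation from full-model variances and covariances. It illustrates the linear case only; for proportional hazards models the identity is an approximation.

```python
import numpy as np

def analytic_correlation(y, x1, x2):
    """Analytic within-cohort correlation in the linear regression case.

    y : outcome (n,), x1 : covariate of interest (n,), x2 : confounders (n, q).
    Returns (rho_a, identity_gap), where identity_gap should be ~0.
    """
    n = len(y)
    # Full model: y on intercept, x1 and x2.
    Df = np.column_stack([np.ones(n), x1, x2])
    bf = np.linalg.lstsq(Df, y, rcond=None)[0]
    rf = y - Df @ bf
    Vf = (rf @ rf / (n - Df.shape[1])) * np.linalg.inv(Df.T @ Df)
    # Partial model: y on intercept and x1 only.
    Dp = np.column_stack([np.ones(n), x1])
    bp = np.linalg.lstsq(Dp, y, rcond=None)[0]
    rp = y - Dp @ bp
    var_p = (rp @ rp / (n - 2)) * np.linalg.inv(Dp.T @ Dp)[1, 1]
    # Regression of each confounder on x1: slopes form the vector c1.
    c1 = np.linalg.lstsq(Dp, x2, rcond=None)[0][1]
    # Algebraic identity between partial and full estimates.
    identity_gap = bp[1] - (bf[1] + bf[2:] @ c1)
    # Covariance as in (7), using only full-model quantities and c1.
    cov_pf = Vf[1, 1] + Vf[1, 2:] @ c1
    rho_a = cov_pf / np.sqrt(var_p * Vf[1, 1])
    return rho_a, identity_gap
```

Note that the denominator uses the directly estimated variance of the partially adjusted estimate, which is why nothing in this construction forces the result to lie in [-1, 1].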

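Modified analytic correlations, $\rho_m$
A potential difficulty is that the analytic approach in Section 4.2 provides no assurance that the correlations lie between −1 and 1; as shown below, three fibrinogen cohorts provide analytic correlations that are slightly greater than one.
A modification of the analytical approach that avoids such estimates can be developed by defining

$$\rho_m = \frac{\mathrm{Cov}(\hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1, \hat{\beta}^f_1)}{\sqrt{\mathrm{Var}(\hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1)\,\mathrm{Var}(\hat{\beta}^f_1)}}.$$

The simplest way to evaluate this variance is to derive the covariance matrix of the six terms that comprise the summation $\hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1$ and evaluate $\mathrm{Var}(\hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1)$ directly from this. Noting that $\hat{c}_1$ is independent of $\hat{\beta}^f_1$ and $\hat{\boldsymbol{\beta}}_2$, the entries of this covariance matrix can be evaluated using the identity

$$\mathrm{Cov}(A_1 B_1, A_2 B_2) = E[A_1]E[A_2]\,\mathrm{Cov}(B_1, B_2) + E[B_1]E[B_2]\,\mathrm{Cov}(A_1, A_2) + \mathrm{Cov}(A_1, A_2)\,\mathrm{Cov}(B_1, B_2),$$

assuming that $A = (A_1, A_2)$ and $B = (B_1, B_2)$ are independent; expected values are approximated by point estimates and the necessary covariances are estimated when fitting the full proportional hazards and the multiple linear regression models. Note that we continue to use the direct estimate of $\mathrm{Var}(\hat{\beta}^p_1)$, not $\mathrm{Var}(\hat{\beta}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}_1)$, for the variance $\sigma_1^2$ in order to follow the convention that within-cohort variances are obtained using standard methods.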
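One way the modified correlation might be computed from fitted-model output is sketched below. The function name `modified_correlation` and its argument layout are assumptions for illustration: `V` is taken to be the estimated covariance matrix of the fully adjusted coefficient of interest followed by the confounder coefficients, `b2` the confounder coefficient estimates, `c1` the estimated slopes of the confounders on the covariate of interest, and `C` the estimated covariance matrix of `c1`. The covariance of products of independent quantities is expanded as the sum of three terms: product of means times each covariance, plus the product of the two covariances.

```python
import numpy as np

def modified_correlation(V, b2, c1, C):
    """Modified analytic correlation (hypothetical helper).

    V  : (q+1, q+1) covariance of (beta_f1_hat, beta_2_hat) from the full model
    b2 : (q,) estimates of beta_2
    c1 : (q,) estimates of the slopes of X2 on the covariate of interest
    C  : (q, q) covariance of c1
    """
    V, b2, c1, C = (np.asarray(a, dtype=float) for a in (V, b2, c1, C))
    q = len(b2)
    # Covariance matrix of the q+1 summands: beta_f1_hat and beta_2j_hat * c1j_hat.
    M = np.empty((q + 1, q + 1))
    M[0, 0] = V[0, 0]
    M[0, 1:] = M[1:, 0] = c1 * V[0, 1:]
    Vb = V[1:, 1:]
    M[1:, 1:] = np.outer(c1, c1) * Vb + np.outer(b2, b2) * C + Vb * C
    cov = M[0].sum()   # covariance of the whole sum with beta_f1_hat
    var_sum = M.sum()  # variance of the whole sum
    return cov / np.sqrt(var_sum * V[0, 0])
```

Since numerator and denominator come from the same covariance matrix, the result respects the usual bounds whenever `V` and `C` are valid covariance matrices.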

Comparison of the procedures for estimating $\rho$
The procedures for estimating $\rho$ provide contrasting approaches. In particular, the bootstrap is computationally expensive, requiring considerable resampling and the repeated fitting of models involving large numbers of participants. Estimates of $\rho$ for the 14 fibrinogen cohorts can however be obtained in minutes, rather than hours, using several hundred bootstrap replications. A technicality here is that the resampling almost inevitably results in ties; Efron's method [23] was used for handling these, although other standard methods also provide very similar estimates of $\rho$ for the fibrinogen data.
The limitation of the analytical approaches is that they involve an algebraic approximation, and it is difficult to ascertain how accurate this is. The analytical approaches should be used only when exactly the same participants are used to fit both full and partial models, as this is required so that $\hat{\boldsymbol{\beta}}^p_1 = \hat{\boldsymbol{\beta}}^f_1 + \hat{\boldsymbol{\beta}}_2 \hat{c}$ in the analogous linear regression, which motivates the approximation. For example, some participants in the 14 fibrinogen cohorts that provide details of $X_2$ have some missing covariates in $X_2$ but provide complete information for $X_1$; including these participants when fitting partial models but then omitting them in full models invalidates the theory. A further issue raised by the analytic approaches is that the partial model is required to involve a subset of covariates from the full model. These issues do not present problems for the bootstrap procedure.
To summarize, no single method is preferable to the others on all grounds, so they are compared in Section 6.

NUMERICAL IMPLEMENTATION
R software was used throughout. The 'survival' package was used to fit all the necessary proportional hazards models and the 'sample' command was used to sample random rows, with replacement, from the data frame for the bootstrap replications required when evaluating $\rho_b$. Having estimated the within-cohort correlations, following van Houwelingen et al. [24], maximum likelihood estimation was performed and confidence intervals were obtained from the profile log-likelihood. It should however be noted that alternative estimation procedures, such as restricted maximum likelihood (REML), are also possible but are unlikely to make much difference here as the sample size is relatively large.
The necessary maximizations of the log-likelihood (6) were performed numerically using the 'optim' command, with the quasi-Newton 'BFGS' method, after transforming the variance and correlation parameters so that the transformed values lie along the whole real line. By specifying particular parameters as further arguments to be passed to the log-likelihood, these can be constrained to particular values and hence profile log-likelihoods can also be obtained numerically. The command 'fdHess', from the 'nlme' package, provides a Hessian matrix that can be evaluated at the maximum likelihood estimates; the negative Hessian of the log-likelihood is the observed information matrix, which is then inverted in order to obtain standard errors. When maximizing the resulting log-likelihoods in this manner, it was necessary to reduce the 'ndeps' vector, which denotes the step sizes for the finite-difference approximation to the gradient, from its default value of $10^{-3}$ to $10^{-5}$. Although the fibrinogen data are not freely available, illustrative R code for performing the analysis is available upon request from the first member of the writing committee.
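The text does not state which transformations were used to map the variance and correlation parameters to the real line. A common choice, shown here purely as an assumption, is the logarithm for the between-cohort standard deviations and Fisher's z (the inverse hyperbolic tangent) for the between-cohort correlation; the function names are hypothetical.

```python
import math

def to_unconstrained(tau1, tau2, kappa):
    """Map (tau1 > 0, tau2 > 0, -1 < kappa < 1) to the whole real line."""
    return math.log(tau1), math.log(tau2), math.atanh(kappa)

def to_constrained(a, b, z):
    """Inverse map, applied inside the log-likelihood before it is evaluated."""
    return math.exp(a), math.exp(b), math.tanh(z)
```

An unconstrained optimizer then works on `(a, b, z)`. Under transformations of this kind a boundary value such as a correlation of exactly one is only approached as `z` grows without bound, which is one reason boundary estimates complicate the use of information matrices.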

SIMPLE ANALYSES OF THE FIBRINOGEN DATA
Three complications arise with this database. First, some individuals in 'complete' cohorts do not in fact have complete data on $X_2$. In order to apply the analytic methods for obtaining correlations in Section 4, we exclude these individuals from the estimation of both full and partial models ('complete-case analysis') but in Section 7 we alternatively include them in the partial model. Second, many 'partial' cohorts have data on some variables in $X_2$, as shown in Table I. We ignore this information but consider its use in the discussion. Third, the partial model previously fitted was not quite a submodel of the full model, as only the partial model includes total cholesterol as a covariate. In order to apply the methods of Section 4 we drop total cholesterol from the partial model in this section; in Section 7 we alternatively include total cholesterol in this model.

Cohort-specific estimates for the fibrinogen data
The estimates $\hat{\beta}^p_1$ and $\hat{\beta}^f_1$ of the effect of fibrinogen level, their within-cohort standard errors, $\sigma_1$ and $\sigma_2$, and the correlations described above, for the 14 cohorts that provide the necessary information, are shown in Table II. The estimates and variances were obtained from standard proportional hazards model output and the correlations were obtained using the bootstrap (Section 4.1, with 500 bootstrap replications), the analytical (Section 4.2) and the modified analytical (Section 4.3) procedures. A 95 per cent confidence interval for a bootstrap within-cohort correlation of 0.95, based on Fisher's transformation, is (0.941, 0.958), indicating that 500 bootstrap replications are sufficient to accurately estimate the correlations.
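The quoted interval can be reproduced with the standard form of Fisher's transformation. The sketch below assumes that the number of bootstrap replications plays the role of the sample size, so that the transformed correlation has standard error $1/\sqrt{500 - 3}$; the function name `fisher_ci` is hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95 per cent confidence interval for a correlation r from n pairs."""
    z = math.atanh(r)              # Fisher's z transformation
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

lo, hi = fisher_ci(0.95, 500)
print(round(lo, 3), round(hi, 3))  # 0.941 0.958
```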
All values of $\rho$ in Table II are large and positive, reflecting the similar nature of the two types of estimates. However, it is interesting to note that values of $\rho_a$ are generally greater than the corresponding values obtained by bootstrapping and, as noted above, three of these were estimated to be very slightly greater than one. These were truncated to 0.999. For scenarios where the estimates are not so highly correlated, this truncation is unlikely to be necessary, suggesting that this analytical solution is likely to perform more satisfactorily in such cases.
The trend in Figure 1 appears linear, lending credence to the assumption of bivariate normality across cohorts. Almost all the confidence regions overlap, indicating that the results are broadly comparable across cohorts. However, it should not be supposed from this plot that the random effects model in (4) is not required and that a fixed effects model is appropriate: the univariate $\chi^2$ heterogeneity statistics for the fully and partially adjusted estimates shown in Figure 1 are 17.8 ($I^2 = 0.27$) and 28.3 ($I^2 = 0.55$), on 13 degrees of freedom, respectively. Although the first of these $\chi^2$ statistics is not significant, the second provides a p-value of 0.008 when testing the null hypothesis that the partially adjusted estimates are homogeneous.

Figure 1. The pairs of estimates $\hat{\beta}^p_1$ and $\hat{\beta}^f_1$ [...]
The estimates $\hat{\beta}^p_1$ from the remaining 17 cohorts are shown in Table III. Cohort 31, although providing an apparently unusual estimated effect, is also by far the smallest cohort (only 418 participants), and there are no obvious signs of outliers. Removing this very small cohort makes virtually no difference to the resulting inferences, although very slightly larger effects of fibrinogen level are obtained if this is discarded.

A complete-case meta-analysis using $\rho_a$
In this section the analytic within-cohort correlations are used. The implications of using other within-cohort correlations are explored in Section 6.3.
The profile log-likelihood was adopted to make inferences about $\beta^f$. In computing this, we fix the value of $\beta^f$ and maximize the log-likelihood (6) over the remaining four parameters, subject to the constraints that $\tau_1 \ge 0$, $\tau_2 \ge 0$ and $-1 \le \kappa \le 1$. This profile log-likelihood is shown in Figure 2 [...] is (0.223, 0.332). Figure 2 masks an important finding that $\hat{\kappa} = 1$ and hence this parameter estimate is located at the edge of the parameter space; indeed this is the case for all the analyses described below. Furthermore the values of $\kappa$ required to maximize the log-likelihood for all the values of $\beta^f$ used to draw Figure 2 are greater than 0.9999. The corresponding profile log-likelihood plot in terms of $\kappa$ is shown in Figure 3, which shows that there is strong evidence for a large and positive between-cohort correlation.
This finding presents difficulties when using information matrices to obtain confidence intervals, although by constraining $\kappa = 1$ we may easily obtain confidence intervals in this way. This comment applies for all the analyses below. This finding is not particularly surprising, given the analogous nature of fully and partially adjusted estimates and the finding of Riley et al. [6] that estimates of between-cohort correlation frequently lie at the edge of the parameter space. In particular, there is not a very large number of cohorts providing information concerning this parameter, and the within-cohort variances are reasonably large relative to the between-cohort variation; these are features of data that tend to give rise to $\hat{\kappa} = \pm 1$ [6].

Comparison of results
The results for the different estimates of $\rho$ are shown in Table IV, where results for both the fully and partially adjusted effect of fibrinogen level are shown and 'modified' refers to the modified analytic within-cohort correlations described in Section 4.3. Estimates of $\kappa$ are not tabulated here because, as noted above, $\hat{\kappa} = 1$ for all models fitted.
Constraining $\kappa = 1$ and obtaining confidence intervals using this reduced model, and the observed information matrix, provides very similar inferences for the effect of fibrinogen level as when using the profile log-likelihood and avoids the difficulty that $\hat{\kappa}$ lies at the edge of the parameter space. The standard errors of all estimates in Table IV are shown in parentheses, using the observed information matrix and this reduced model. It should be noted that the extreme estimate of $\kappa$ tends to lead to inflated estimates of between-cohort variance [6] but these are not insubstantial and all the various models in Table IV provide similar results. It is interesting to note that estimates of between-cohort variance of the partially adjusted effects are much greater than the corresponding estimates for the fully adjusted effects, suggesting that the additional confounders incorporated into the full model explain some of the heterogeneity in the estimates of the effect of fibrinogen level. We conclude that the choice of method for estimating $\rho$ is not important in these data. Using the bootstrap within-cohort correlations provides a 95 per cent confidence interval of (1.24, 1.38) for the fully adjusted hazard ratio for fibrinogen level. The average fibrinogen level in the sample is 3.02 and the lower and upper quartiles are 2.47 and 3.47, respectively, indicating that participants with fibrinogen levels in the top quartile are at considerably more risk of a coronary heart disease event than those in the lower quartile.
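Intervals on the log hazard ratio scale are converted to the hazard ratio scale by exponentiating their endpoints. As a small worked example, the sketch below applies this to the profile log-likelihood interval (0.223, 0.332) reported above for the analysis using the analytic within-cohort correlations; the function name is hypothetical, and the interval is used only to illustrate the conversion.

```python
import math

def to_hazard_ratio(log_hr_interval):
    """Exponentiate the endpoints of an interval for a log hazard ratio."""
    lower, upper = log_hr_interval
    return math.exp(lower), math.exp(upper)

hr_lower, hr_upper = to_hazard_ratio((0.223, 0.332))
print(round(hr_lower, 2), round(hr_upper, 2))  # 1.25 1.39
```

The result is close to the interval of (1.24, 1.38) obtained with the bootstrap correlations, in line with the conclusion that the choice of method matters little here.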

Comparison with analyses of 'full' cohorts
The bivariate model gives estimated fully adjusted effects of fibrinogen in the range 0.259-0.275, with standard errors of around 0.027. For comparison, a simple univariate random-effects meta-analysis, using just the 14 cohorts that provide the necessary information (and hence using just the data in columns 2 and 3 of Table II), provides a point estimate of $\hat{\beta}^f = 0.273$ with a standard error of 0.038. Furthermore, a standard bivariate random-effects meta-analysis using just these 14 cohorts and the bootstrap within-cohort correlations (i.e. using only the data in columns 2-6 of Table II) gives $\hat{\beta}^f = 0.282$ with a standard error, obtained in the same way as in Table IV, of 0.041. These much larger standard errors indicate that the extra information incorporated into the model developed here has been worthwhile in estimating the effect of fibrinogen, as the reduction in the standard error using the proposed procedure is around 30 per cent. It is also interesting to note that the standard error resulting from using the bivariate random-effects model for just the 14 cohorts that provide details of $X_2$ is very similar to that from the usual univariate meta-analysis of the fully adjusted effects. This indicates that little or no 'borrowing of strength' from unadjusted estimates of effect occurs for this example unless partially adjusted estimates from the remaining 17 cohorts are also used in the analysis. This is not surprising, as a number of previous articles highlight that, given complete data, there is little benefit of bivariate over univariate meta-analysis [6,25,26].
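The size of the gain can be checked from the two standard errors quoted above, taking 0.027 for the proposed model and 0.038 for the univariate analysis of the 14 cohorts:

$$1 - \frac{0.027}{0.038} \approx 0.29, \qquad \left(\frac{0.038}{0.027}\right)^2 \approx 1.98.$$

The first quantity is the relative reduction in the standard error, of around 30 per cent. The second expresses the same comparison on the variance scale: under the usual interpretation of the inverse variance as the amount of information, the proposed analysis carries roughly twice the information about the fully adjusted effect.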

EXTENDED ANALYSES
The analyses in the previous section are not entirely in the spirit of those performed previously [15]. First, some individuals in 'full' cohorts had incomplete $X_2$. We excluded these individuals from estimation of both full and partial models in Section 6, which we called a complete-case analysis. Now we will follow the previous analysis by including these individuals in estimating the partial model.
Second, we have omitted total cholesterol from the partial model. Now we follow the previous analysis by including it.
These changes invalidate the assumptions underlying the analytic approaches for estimating $\rho$. We therefore use bootstrap within-cohort correlations in this section. Here the bootstrap replications simply include total cholesterol and incomplete cases when fitting partial models but are otherwise obtained as before. Making the above changes slightly modifies the cohort-specific values of $\hat{\beta}^p_1$ and the accompanying standard errors, but leaves $\hat{\beta}^f_1$ unchanged. Following the same procedure as before provides $\hat{\beta}^f = 0.259$ and a 95 per cent confidence interval from the profile log-likelihood is (0.208, 0.314). This is broadly similar to the analyses above but a slightly smaller effect of fibrinogen is inferred; very similar inferences are also made by assuming that $\kappa = 1$ and using the observed information matrix, as shown by the second set of results in Table V.

Comparison of results
The choices concerning whether or not to include incomplete cases and total cholesterol when fitting partial models are somewhat arbitrary and the implications of these decisions are now explored. The results for models that exclude both total cholesterol and incomplete cases have been summarized in Table IV; the corresponding results for the other three possibilities, using bootstrapping to obtain the within-cohort correlations, are summarized in Table V. A comparison of Tables IV and V indicates that the inferences are not particularly sensitive to these choices when fitting partial models.

A simplified model
A plot similar to Figure 1, but with the difference between fully and partially adjusted estimates on the horizontal axis, and where partial models include total cholesterol, is shown in Figure 4. The differences between the fully and partially adjusted estimates (horizontal axis of Figure 4) appear homogeneous, and the usual $\chi^2$ heterogeneity statistic of these differences is just 8.0 on 13 degrees of freedom. This apparent homogeneity is however highly sensitive to the estimated within-cohort correlations, so although not too much emphasis should be placed on this finding, this does suggest a simpler model that adequately describes the data. The point estimate $\hat{\kappa} = 1$ is obtained for all the various models, as noted above, and assuming that both $\kappa = 1$ and $\mathrm{Var}(\beta^f_1 - \beta^p_1) = 0$ is equivalent to assuming that both $\kappa = 1$ and $\tau_1^2 = \tau_2^2$ in model (5). Hence a much simpler model that appears to describe the data well is model (5) subject to these two constraints.

Figure 4. The difference between fully and partially adjusted, and partially adjusted, estimated effects of fibrinogen level and corresponding 95 per cent confidence intervals (horizontal axis: difference between fully and partially adjusted log hazard ratios for fibrinogen level; vertical axis: partially adjusted log hazard ratio for fibrinogen level). Note that the partially adjusted estimates shown here adjust for total cholesterol, and hence are not quite the same as those shown in Figure 1 or Table I.
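The equivalence between the two pairs of constraints in the simplified model can be verified directly. Writing $\tau_1^2$ and $\tau_2^2$ for the between-cohort variances of the partially and fully adjusted effects and $\kappa$ for their between-cohort correlation (notation assumed here), the between-cohort variance of the difference is

$$\mathrm{Var}(\beta^f_1 - \beta^p_1) = \tau_2^2 + \tau_1^2 - 2\kappa\tau_1\tau_2,$$

and setting $\kappa = 1$ gives

$$\mathrm{Var}(\beta^f_1 - \beta^p_1) = (\tau_2 - \tau_1)^2,$$

which is zero if and only if $\tau_1 = \tau_2$, that is, $\tau_1^2 = \tau_2^2$. Under both constraints the fully adjusted effect in every cohort differs from the partially adjusted effect by the same constant, $\beta^f - \beta^p$.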
Using the data shown in Figure 4, this model provides a point estimate $\hat{\beta}^f = 0.259$ with a standard error of 0.027. This is very similar inference to the analyses performed above, so little is gained by this simplification: the change in deviance of this model, compared with the full model where $\kappa$, $\tau_1^2$ and $\tau_2^2$ are allowed to take any value in their joint parameter space, is just 1.1 on 2 degrees of freedom.
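To put the change in deviance in context, it can be referred to a chi-squared distribution on 2 degrees of freedom, whose upper tail probability has the closed form $\exp(-x/2)$. The sketch below uses this as a rough guide only: the reference distribution is an assumption, and the usual asymptotic theory may not hold exactly when a constrained value lies on the boundary of the parameter space. The function name is hypothetical.

```python
import math

def chi2_sf_2df(x):
    """Upper tail probability of a chi-squared variable on 2 degrees of freedom."""
    return math.exp(-x / 2.0)

p_value = chi2_sf_2df(1.1)
print(round(p_value, 3))  # 0.577
```

A value of this size gives no indication that the two constraints worsen the fit.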

CONCLUSIONS
Observational studies are likely to differ in terms of the information that they provide and are particularly susceptible to the influence of confounders. Provided, however, that fully and partially adjusted estimates are kept distinct, and assuming that at least some studies provide enough information to produce fully adjusted estimates, the methodology we have developed can be used or modified to incorporate data from all available cohorts. Similar methods are likely to be useful in other individual participant data (IPD) meta-analyses such as the million-person Emerging Risk Factors Collaboration [27].

Various extensions of the model are possible. For example, this kind of procedure could be used for other types of outcome, such as continuous and binary variables, or for other types of study, such as case-control studies. Furthermore, the fibrinogen data were analysed as providing two types of cohorts: those that provide adequate details of all the extra variables and those that do not. In other scenarios there may be much more complicated patterns of missingness, and Table I shows that this distinction between the two types of fibrinogen cohort is a simplification. In other examples, more dimensions may be needed than the bivariate model suggested here allows. For example, one could fit models containing $X_1$; $X_1$ and $X_2$; $X_1$ and $X_3$; and $X_1$, $X_2$ and $X_3$, and perform a four-dimensional meta-analysis. Whether the additional model complexity is justified by increases in precision is a topic for further research.
An alternative way to estimate the within-cohort correlation is to estimate the unadjusted and adjusted Cox regressions simultaneously. Individuals from cohorts with complete data each contribute two records to this analysis, whereas individuals from other cohorts contribute only one record. The within-cohort correlation is obtained from the robust variance-covariance matrix of Wei et al. [28], which allows for dependence of different records on the same individual. This method provided similar correlations and inferences to those in Tables II, IV and V (results not shown). For situations where they are computationally feasible, the bootstrap correlations are perhaps preferable in practice because no linear approximation is required to derive them. Even if within-cohort correlations provided by alternative methods differed more notably, a recent simulation study concludes that a variety of types of errors when approximating within-cohort correlations have little impact on the estimation of the population means if there is complete data [25]. Here we borrow strength from 17 cohorts with incomplete data, however, and the correlation between the partially and fully adjusted estimates is crucial to the procedure adopted.
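The bootstrap route to a within-cohort correlation can be illustrated with a minimal sketch. It makes several assumptions that are not part of the analysis reported here: ordinary least squares is swapped in for Cox regression to keep the example self-contained, the data are simulated, and all function and variable names are hypothetical. Participants in a single cohort are resampled with replacement, the full and partial models are refitted to each resample, and the correlation between the two resulting coefficients of the exposure is taken across resamples.

```python
import numpy as np


def x1_coefficient(y, covariates):
    """Least-squares coefficient of the first covariate (an intercept is added)."""
    design = np.column_stack([np.ones(len(y))] + list(covariates))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def bootstrap_within_cohort_correlation(y, x1, x2, n_boot=500, seed=0):
    """Correlation between fully and partially adjusted x1 estimates over resamples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    full = np.empty(n_boot)
    partial = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants with replacement
        full[b] = x1_coefficient(y[idx], [x1[idx], x2[idx]])  # adjusts for x2
        partial[b] = x1_coefficient(y[idx], [x1[idx]])        # omits x2
    return np.corrcoef(full, partial)[0, 1]


# Hypothetical simulated cohort: x2 confounds the association between x1 and y.
rng = np.random.default_rng(1)
n = 400
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 0.3 * x1 + 0.4 * x2 + rng.normal(size=n)

rho_hat = bootstrap_within_cohort_correlation(y, x1, x2)
print(round(rho_hat, 3))
```

In an analysis of time-to-event data, the two least-squares fits would be replaced by the corresponding partially and fully adjusted Cox regressions, which is where the computational cost mentioned above arises.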
A recent proposal [29] for fitting a bivariate normal meta-analysis model, without estimating the within-cohort correlations, was not adopted as this does not separately reflect the within and between-cohort correlations, on which the borrowing of strength so critically relies when analysing the fibrinogen data in this way. The within-cohort correlations can be obtained with effort and 'If practitioners are fortunate to have the within-study correlations available, or if they can be assumed zero, then we recommend that they still perform a bivariate random-effects meta-analysis using the general model' [29] (model (5), as used here). Despite this, borrowing of strength can mostly be achieved without separating the within and between-cohort correlations and a referee pointed out that fitting this alternative model resulted in a similar estimate and confidence interval for the fully adjusted effect of fibrinogen. Those who wish to avoid estimating the within-cohort correlations altogether therefore have a viable alternative to consider.
Simply assuming that all the within-cohort correlations equal 1 provides very similar inferences for the fibrinogen data, and one might consider assuming a correlation of 1 in order to quickly obtain some indicative results. Reparameterizations of the model, in order to avoid an estimated between-cohort correlation of 1, were also considered. In particular, a bivariate normal model for the difference between, and the mean of, the fully and partially adjusted estimates was examined, but this provided an estimated correlation of −1 for almost all the types of analyses described above, and hence did not avoid the difficulties associated with estimating this correlation. A further extension is to consider the possibility of addressing measurement error when measuring fibrinogen levels [30].
Our method assumes that confounders unmeasured in a particular cohort are missing at random (MAR): that is, there is no systematic difference in the fully adjusted estimates between studies that do and do not measure $X_2$, once we allow for differences in the partially adjusted estimates. By contrast, the standard univariate method of analysing only the cohorts that measured $X_2$ assumes that the confounders are missing completely at random: that is, there is no systematic difference in the fully adjusted estimates between cohorts that do and do not measure $X_2$. This seems much less plausible, since, for example, the studies that did not measure $X_2$ may be earlier studies that used alternative methods, including different ways to measure the exposure of interest.
Combining unadjusted and adjusted estimates was previously proposed by Steyerberg and colleagues [7,8]. They started with just three estimates: unadjusted and fully adjusted estimates from IPD, and an unadjusted estimate from published literature. Their proposed estimate of the fully adjusted coefficient equals the fully adjusted estimate from the IPD plus an 'adaptation factor' times the difference between the unadjusted estimates. The adaptation factor is computed from the observed standard errors and within-study correlations to minimize the variance of the resulting 'adapted estimate'. One could extend this approach here, obtaining fully adjusted 'individual participant' estimates by conducting a bivariate meta-analysis using the 14 cohorts that provide full confounder information, and replacing the unadjusted estimate from published literature with the result of a univariate meta-analysis of the remaining 17 cohorts. If fixed-effect meta-analyses are performed then the adapted estimate turns out to be the same as a fixed-effect version of our procedure (i.e. setting $\tau_1 = \tau_2 = 0$ in equation (4)). This equivalence is lost if random-effects models are used. Our method has the advantages of greater transparency, because the model is clearly stated, and greater statistical efficiency because $\tau_2^2$ is assumed equal across the two subsets of cohorts. Our procedure also has the benefit of ease of computation, because only one bivariate random-effects meta-analysis is needed, and facilitates an entirely likelihood-based approach.
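The fixed-effect equivalence can be illustrated for the simplest case of three estimates. The sketch below is not taken from [7,8]: it assumes that the difference between the unadjusted estimates is taken as literature minus IPD, derives the variance-minimizing adaptation factor directly from that definition, and compares the result with a generalized least squares fit of a fixed-effect model in which the two unadjusted estimates share a common mean. All numerical inputs and function names are hypothetical.

```python
import numpy as np


def adapted_estimate(b_full, se_full, b_unadj, se_unadj, rho, b_lit, se_lit):
    """Fully adjusted IPD estimate plus a variance-minimizing adaptation term."""
    cov = rho * se_full * se_unadj           # within-study covariance in the IPD
    denom = se_unadj**2 + se_lit**2          # variance of (b_lit - b_unadj)
    factor = cov / denom                     # minimizes the variance of the sum
    estimate = b_full + factor * (b_lit - b_unadj)
    variance = se_full**2 - cov**2 / denom
    return estimate, np.sqrt(variance), factor


def fixed_effect_gls(b_full, se_full, b_unadj, se_unadj, rho, b_lit, se_lit):
    """Fixed-effect bivariate fit: both unadjusted estimates share one mean."""
    y = np.array([b_full, b_unadj, b_lit])
    cov = rho * se_full * se_unadj
    V = np.array([[se_full**2, cov, 0.0],
                  [cov, se_unadj**2, 0.0],
                  [0.0, 0.0, se_lit**2]])    # literature estimate is independent
    X = np.array([[1.0, 0.0],                # column 0: fully adjusted mean
                  [0.0, 1.0],                # column 1: unadjusted mean
                  [0.0, 1.0]])
    W = np.linalg.inv(V)
    cov_beta = np.linalg.inv(X.T @ W @ X)
    beta = cov_beta @ X.T @ W @ y
    return beta[0], np.sqrt(cov_beta[0, 0])


# Hypothetical inputs, chosen only to exercise the functions.
inputs = dict(b_full=0.26, se_full=0.05, b_unadj=0.35, se_unadj=0.04,
              rho=0.8, b_lit=0.40, se_lit=0.03)

adapted, adapted_se, factor = adapted_estimate(**inputs)
gls, gls_se = fixed_effect_gls(**inputs)
print(adapted, gls)
print(adapted_se, gls_se)
```

The two approaches agree in this setting because every linear unbiased estimator of the fully adjusted mean has the form of the fully adjusted estimate plus a multiple of the difference between the unadjusted estimates, so minimizing the variance over that multiple yields the generalized least squares solution.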
The methodology developed here could also be applied if some cohorts provide IPD while others provide only aggregate results, for example just one of the estimated effects and accompanying standard error. This is the scenario specifically considered by Steyerberg et al. [7,8], and Riley et al. [31] provide a recent review of such methods. If necessary, one might assume that aggregate within-cohort correlations are comparable to those where the IPD are available. The assumption that estimated effects are MAR becomes much stronger when using aggregate data in this way, however, as the publication of a particular analysis might depend on the results obtained, rather than just the availability of the covariates.