## 1. Introduction

### 1.1. Standard methods of meta-analysis and their limitations

Meta-analysis is often undertaken in two stages, even when individual participant data are available. At the first stage, each study is analysed to provide an estimate of the parameter of interest, together with its standard error. At the second stage, the estimates are combined across studies; in a random-effects meta-analysis, potential heterogeneity between the study-specific parameters is permitted (Higgins *et al.*, 2009).

Writing *x*_{i} as the estimate of parameter *θ*_{i} in study *i*, with the standard error denoted by *s*_{i}, the usual two-stage random-effects meta-analysis model is

$$x_i \sim \mathrm{N}(\theta_i, s_i^2)$$

within each study *i*, and

$$\theta_i \sim \mathrm{N}(\mu, \sigma^2)$$

across studies *i*=1,…,*N*, where *σ*^{2} is the between-study heterogeneity variance. In practice, *s*_{i}^{2} is estimated but then assumed to be without error in the above model, and normal distributions are assumed at both the first and the second stages. Although estimation of the overall parameters *μ* and *σ*^{2} can be by maximum likelihood or restricted maximum likelihood, a (non-iterative) moment estimator of *σ*^{2} is most often used in practice. The inference about *μ* is usually made by using an asymptotic normal approximation (i.e. asymptotic with respect to the number of studies), assuming that *σ*^{2} is fixed and known.
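As an illustration of the standard second stage, the following minimal sketch combines study estimates and their standard errors (treated as known) by using a non-iterative moment estimator of the heterogeneity variance. The text does not name a particular moment estimator; the DerSimonian-Laird form is assumed here, and the function name and the numerical inputs are hypothetical.

```python
import numpy as np

def two_stage_random_effects(x, s):
    """Second stage of a random-effects meta-analysis.

    x: study estimates; s: their standard errors (treated as known).
    Assumes the DerSimonian-Laird moment estimator for sigma^2.
    """
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    w = 1.0 / s**2                               # inverse-variance weights
    mu_fixed = np.sum(w * x) / np.sum(w)         # fixed-effect pooled estimate
    q = np.sum(w * (x - mu_fixed) ** 2)          # Cochran's Q statistic
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    sigma2 = max(0.0, (q - (len(x) - 1)) / c)    # moment estimate, truncated at 0
    w_star = 1.0 / (s**2 + sigma2)               # random-effects weights
    mu = np.sum(w_star * x) / np.sum(w_star)
    se_mu = np.sqrt(1.0 / np.sum(w_star))        # treats sigma2 as fixed and known
    ci = (mu - 1.96 * se_mu, mu + 1.96 * se_mu)  # asymptotic normal 95% interval
    return mu, se_mu, sigma2, ci

# Hypothetical study estimates (e.g. log-odds ratios) and standard errors
x = [-0.40, -0.10, 0.25, -0.65, -0.30]
s = [0.20, 0.15, 0.30, 0.25, 0.10]
mu, se_mu, sigma2, ci = two_stage_random_effects(x, s)
print(f"mu = {mu:.3f}, se = {se_mu:.3f}, sigma^2 = {sigma2:.3f}")
print(f"95% interval: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval returned here ignores the uncertainty in both the standard errors and the estimated heterogeneity variance, which is the limitation that the text describes.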

A Bayesian version of this model, in which study-specific and overall parameters are estimated simultaneously, can be implemented straightforwardly by using Markov chain Monte Carlo (MCMC) methods (Gelfand and Smith, 1990; Metropolis *et al.*, 1953; Hastings, 1970). This has several advantages: for example, uncertainty on all parameters, including *σ*^{2}, is acknowledged simultaneously; prior information may be incorporated (e.g. Smith *et al.* (1995)); a credible interval for *μ* can simply be taken from the quantiles of its estimated posterior distribution, with no asymptotic normal approximation needed; and, although the normality assumption for the between-study model is usually retained, a more flexible distribution could be used in principle (Lee and Thompson, 2008).
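To make the Bayesian version concrete, the sketch below runs a simple Gibbs sampler for the normal random-effects model with known within-study standard errors. The priors are assumptions made purely for illustration (a flat prior on *μ* and an inverse-gamma(*a*, *b*) prior on *σ*^{2}) and are not taken from the text; the function name and the data are hypothetical.

```python
import numpy as np

def gibbs_random_effects(x, s, n_iter=5000, burn_in=1000, a=0.001, b=0.001, seed=0):
    """Gibbs sampler for x_i ~ N(theta_i, s_i^2), theta_i ~ N(mu, sigma2).

    Assumed priors: flat on mu, inverse-gamma(a, b) on sigma2.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    s2 = np.asarray(s, dtype=float) ** 2
    n = len(x)
    mu, sigma2 = x.mean(), max(x.var(), 1e-3)    # crude starting values
    mu_out, sigma2_out = [], []
    for it in range(n_iter):
        # theta_i | rest: precision-weighted combination of x_i and mu
        prec = 1.0 / s2 + 1.0 / sigma2
        theta = rng.normal((x / s2 + mu / sigma2) / prec, np.sqrt(1.0 / prec))
        # mu | rest, under the flat prior
        mu = rng.normal(theta.mean(), np.sqrt(sigma2 / n))
        # sigma2 | rest: inverse-gamma, drawn as the reciprocal of a gamma variate
        rate = b + 0.5 * np.sum((theta - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(a + n / 2.0, 1.0 / rate)
        if it >= burn_in:
            mu_out.append(mu)
            sigma2_out.append(sigma2)
    return np.array(mu_out), np.array(sigma2_out)

# Hypothetical study estimates and standard errors
x = [-0.40, -0.10, 0.25, -0.65, -0.30]
s = [0.20, 0.15, 0.30, 0.25, 0.10]
mu_draws, sigma2_draws = gibbs_random_effects(x, s)

# Credible interval for mu taken directly from posterior quantiles
lo, hi = np.quantile(mu_draws, [0.025, 0.975])
print(f"posterior median of mu: {np.median(mu_draws):.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```

The interval is read off the sampled values of *μ*, so no asymptotic normal approximation is involved, and the draws of *σ*^{2} carry its uncertainty through to *μ*. Results with few studies can be sensitive to the prior chosen for *σ*^{2}, so the prior used here should be read as a placeholder.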

The focus of this paper, however, is on exploiting individual participant data, where available, to avoid the need for two potentially limiting assumptions in the above model:

- (a) that the study-specific estimates are normally distributed;
- (b) that the associated uncertainties (variances) are known.

The former may be inappropriate for studies with relatively sparse data, or when the parameters of interest are unconventional. The latter is circumvented with individual participant data because the full uncertainty regarding study-specific parameters is naturally propagated into the between-study model, and feedback is allowed from the between-study model to the estimation of study-specific parameters. For simple data structures, a non-Bayesian analysis can be achieved by using linear mixed models for continuous outcomes, or generalized linear mixed models for binary outcomes. The inference about *μ*, however, is again usually made by using an asymptotic normal approximation, assuming that *σ*^{2} is fixed and known (Higgins *et al.*, 2001). Alternatively, a Bayesian analysis can be implemented using MCMC sampling. In addition to the advantages that were outlined above, MCMC methods can be used when the study-specific data structures are complex.

Meta-analyses that make use of individual participant data are currently less common than their aggregate data counterparts, but their application is on the rise, especially in medicine (Riley *et al.*, 2010; Thompson *et al.*, 2010). Riley *et al.* (2010) presented a graphical summary of the trend over time, which shows around 50 such analyses per year being published by 2008. The Cochrane Library now contains over 70 such analyses.

### 1.2. Two-stage Bayesian methods

This paper focuses on analysis of the *full hierarchical model*, in which the individual participant data are used to estimate study-specific and overall parameters simultaneously. A two-stage strategy, in which study-specific parameters are estimated separately in stage 1, is very attractive in several situations, however. In this paper we propose a novel method for fitting the full hierarchical model in two stages. The idea is to fit a model to each study's data independently in stage 1. The resulting study-specific posterior distributions are then used as proposal distributions for the study-specific parameters in stage 2, where those parameters are assumed to arise from a common population distribution (with unknown mean and variance, say). We describe the approach in detail in Section 3 but outline here several scenarios in which it may be useful.

- (a) *When study-specific analyses are complex and/or time consuming*: study-specific data structures may be complex, requiring study level hierarchical models, with complex and/or non-linear regressions, say. Different studies may require different models, with different parameterizations possibly (although there must, by definition, be common parameters of interest across studies). It may thus be cumbersome to assemble computer code for analysing all studies simultaneously. If study-specific analyses are time consuming then a simultaneous analysis may be prohibitively so. A two-stage approach allows the analyst to consider the studies one by one, tailoring each analysis to the individual study and directly addressing any study-specific issues that may arise, such as convergence difficulties in an MCMC simulation. If, for example, posterior correlations between parameters are large for some studies, necessitating *long* simulations, there is no need to apply the same ‘run length’ to all studies. In fact, it is quite natural initially to explore the studies separately anyway, to identify appropriate models, to ensure that study-specific inferences make sense and to establish a model for linking the studies together.
- (b) *When there are several models or parameters of interest to consider*: in cases where we wish to examine any relationships that may exist between the study-specific parameters and study level covariates, a two-stage approach allows these to be explored efficiently, without having to analyse the study-specific data repeatedly. Similarly, if there are several models to be entertained for fitting the study-specific data, these can be explored without having to fit the full hierarchical model. The effect of study-specific assumptions on overall inferences can then be readily explored in stage 2. Sometimes there may be multiple parameters of interest, such as predictive quantities for a range of prespecified conditions. Using MCMC methods for our study-specific analyses means that we can obtain study-specific inferences for *any* parameterization of interest simply by transforming the MCMC output. Overall inferences are then simply a matter of running a computationally efficient second stage for each parameter set of interest.
- (c) *When the parameters of interest are complex functions of the ‘natural’ parameters*: in such cases it may be cumbersome to express the likelihood in terms of the parameters of interest, which is a fundamental requirement for a one-stage analysis. Sometimes this may even be impossible, because we cannot invert, algebraically, the relationship between parameters of interest and natural parameters (those that the likelihood is naturally expressed in terms of), although this inversion could, in principle, be performed numerically. Either way, a one-stage analysis is then problematic. Our proposed two-stage method offers a convenient way around this problem, exploiting again the fact that study-specific inferences for any parameterization of interest can be obtained by transforming appropriate MCMC output.
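The two-stage idea outlined at the start of this subsection (stage 1 posterior samples reused as stage 2 proposals) can be sketched as an independence Metropolis-Hastings step. This is an illustrative reading only; the authors' own algorithm is described in Section 3 and may differ in detail. The sketch assumes a scalar parameter per study, a known stage 1 prior N(`prior_mean`, `prior_var`), a normal population distribution, and the same illustrative priors on *μ* and *σ*^{2} as a simple Gibbs sampler would use; all names and the simulated data are hypothetical. Because the proposal is the stage 1 posterior, the study likelihood cancels from the acceptance ratio, leaving only the ratio of the population density to the stage 1 prior.

```python
import numpy as np

def log_norm(z, mean, var):
    """Log density of N(mean, var) evaluated at z."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (z - mean) ** 2 / var)

def stage_two(stage1_draws, prior_mean, prior_var,
              n_iter=4000, burn_in=1000, a=0.001, b=0.001, seed=0):
    """stage1_draws: list of 1-D arrays of posterior draws of theta_i, each
    obtained independently under the stage 1 prior N(prior_mean, prior_var)."""
    rng = np.random.default_rng(seed)
    n = len(stage1_draws)
    theta = np.array([d[0] for d in stage1_draws], dtype=float)
    mu, sigma2 = theta.mean(), max(theta.var(), 1e-3)
    mu_out, sigma2_out = [], []
    for it in range(n_iter):
        for i, draws in enumerate(stage1_draws):
            prop = rng.choice(draws)  # proposal: one stage 1 posterior draw
            # Likelihood cancels: compare population density / stage 1 prior
            log_ratio = (log_norm(prop, mu, sigma2)
                         - log_norm(prop, prior_mean, prior_var)
                         - log_norm(theta[i], mu, sigma2)
                         + log_norm(theta[i], prior_mean, prior_var))
            if np.log(rng.uniform()) < log_ratio:
                theta[i] = prop
        # Population parameters: flat prior on mu, inverse-gamma(a, b) on sigma2
        mu = rng.normal(theta.mean(), np.sqrt(sigma2 / n))
        rate = b + 0.5 * np.sum((theta - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(a + n / 2.0, 1.0 / rate)
        if it >= burn_in:
            mu_out.append(mu)
            sigma2_out.append(sigma2)
    return np.array(mu_out), np.array(sigma2_out)

# Stage 1 stand-in: simulated studies with a conjugate normal posterior,
# used here in place of an arbitrary study-specific MCMC run.
prior_mean, prior_var = 0.0, 100.0
sim = np.random.default_rng(1)
true_theta = sim.normal(0.5, 0.3, size=6)
stage1_draws = []
for t in true_theta:
    y = sim.normal(t, 1.0, size=40)               # unit variance assumed known
    post_var = 1.0 / (1.0 / prior_var + len(y))
    post_mean = post_var * (prior_mean / prior_var + y.sum())
    stage1_draws.append(sim.normal(post_mean, np.sqrt(post_var), size=5000))

mu_draws, sigma2_draws = stage_two(stage1_draws, prior_mean, prior_var)
print(np.quantile(mu_draws, [0.025, 0.5, 0.975]))
```

Stage 2 never touches the raw study data, only the stored stage 1 draws, so it can be rerun cheaply for a different population model or for a transformed parameter, provided that the stage 1 prior density can be evaluated on the same scale.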

The motivating data that we consider below exemplify all the above three scenarios. They require complex, study level hierarchical models, and we are interested in many complex functions of the natural parameters. We would not realistically have been able to perform such an analysis without the developed two-stage methodology.

Although the above motivation for our work is in terms of meta-analysis, it is likely that two-stage or multistage Bayesian methods would have a range of other applications that could be explored. For example, in population pharmacokinetics, a potentially complex non-linear regression is fitted to repeated measurements from each of a number of individuals (e.g. Lunn *et al.* (2002)). Interindividual variability among the resulting parameters can sometimes be partially explained by various individual level covariates, providing scope for individualized dosage regimens in the target population. A two-stage approach could expedite the search for important covariates.

This paper is aimed at both methodological and applied statisticians. The methods are described in sufficient detail that they may be straightforwardly implemented in a low level (or high level, e.g. R (Ihaka and Gentleman, 1996)) language of choice, or extended to other application areas. For readers who are less interested in the methodological detail, an implementation of the approach within the BUGS software (Lunn *et al.*, 2000, 2009) has been developed. The structure of the paper is as follows. In Section 2 we describe illustrative data on the effect of diuretics on risk of pre-eclampsia during pregnancy, as well as motivating data relating to the growth and rupture of abdominal aortic aneurysms (AAAs) (Sweeting and Thompson, 2011). In Section 3 we describe, in detail, the two-stage fully Bayesian approach, highlighting both its extensions and its limitations. Section 4 presents analyses of the data described in Section 2, whereas Section 5 contains a concluding discussion. Details regarding the implementation of our method in BUGS are provided in an on-line appendix.