## 1. Introduction

Now that meta-analysis is well established in medical statistics, it is perhaps easy to forget that, until relatively recently, its use was considered controversial by the medical community [1, 2]. In particular, Eysenck's provocative article, published in the *British Medical Journal* in 1994 [3], still makes interesting reading today, and some might argue that the difficulties he identified have yet to be satisfactorily resolved. Issues such as the quality of studies, nonlinear associations, and the debate between fixed and random effects meta-analyses, which Eysenck alludes to by referring to ‘Adding apples and oranges’, have subsequently received a great deal of attention and are points that anyone contemplating performing a meta-analysis should consider carefully. The second problem that Eysenck describes is that ‘effects are often multivariate rather than univariate’ and he notes, in the context of an example involving passive smoking, that meta-analysis ‘attempts a univariate type of analysis of a clearly multivariate problem’. We agree that medical studies often examine multiple, and correlated, outcomes of interest to the meta-analyst. A simple example is overall and disease-free survival.

The general problem is therefore to make inferences about correlated study effects, where each study estimates one or more of them and ideally provides the corresponding within-study covariance matrix. Not all studies may provide estimates of all effects of interest, so it is vitally important to handle missing data in a suitable way. We will describe the precise form of the multivariate random effects model in Section 3, and methods for fitting it in Section 4, but until then it is essential that the reader keeps the general problem firmly in mind. For a detailed account of the univariate methods that are extended here, see Normand's tutorial [4].

The variation in the studies' effects is separated into two components by the random effects model. The within-study variation refers to the variation in the repeated sampling of the studies' results if they were replicated, and the between-study variation refers to any variation in the studies' true underlying effects. Hence, we have both within- and between-study correlations in the multivariate random effects model. Within-study correlation occurs because different effects are calculated using the same set of patients. For example, if the effects of interest relate to desirable outcomes such as overall and disease-free survival status, then they will almost necessarily be positively correlated.
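As a rough sketch of this decomposition, and assuming normality at both levels (the notation here is ours for illustration only; the precise form of the model is the one given in Section 3), let $y_i$ denote the vector of estimated effects from study $i$, $\theta_i$ its true underlying effects, $S_i$ its within-study covariance matrix, and $\Sigma$ the between-study covariance matrix:

$$
y_i \mid \theta_i \sim N(\theta_i, S_i), \qquad \theta_i \sim N(\mu, \Sigma), \qquad \text{so that marginally} \qquad y_i \sim N(\mu, S_i + \Sigma).
$$

Under these assumptions, the off-diagonal entries of $S_i$ carry the within-study correlations and those of $\Sigma$ carry the between-study correlations.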

The between-study correlation allows the true underlying outcome effects to be correlated and hence the studies' effects to be more or less correlated than we would expect from the within-study variation alone. An obvious situation where the between-study correlation is important is the meta-analysis of diagnostic test accuracy. Here, within studies, the sensitivities and specificities are assumed to be independent because they are calculated using data from different individuals. Despite this, a negative correlation between these quantities across studies is likely [5] because studies that adopt a less stringent criterion for declaring a test positive obtain higher sensitivities and lower specificities.
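The threshold argument above can be illustrated with a small simulation. This is only a sketch under assumptions that are ours and not taken from any data set: test scores are taken to be normally distributed with mean 0 in non-diseased and mean 2 in diseased individuals, and each hypothetical study chooses its own positivity threshold. All names and values below are illustrative.

```python
import random
from statistics import NormalDist


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_true_accuracy(n_studies=50, seed=1):
    """True sensitivity and specificity for studies with varying thresholds."""
    rng = random.Random(seed)
    non_diseased = NormalDist(0.0, 1.0)  # assumed score distribution
    diseased = NormalDist(2.0, 1.0)      # assumed score distribution
    sensitivities, specificities = [], []
    for _ in range(n_studies):
        # Each study declares the test positive when the score exceeds
        # its own (hypothetical) threshold.
        threshold = rng.uniform(0.5, 1.5)
        sensitivities.append(1.0 - diseased.cdf(threshold))
        specificities.append(non_diseased.cdf(threshold))
    return sensitivities, specificities


sens, spec = simulate_true_accuracy()
between_study_corr = pearson(sens, spec)
print(f"Between-study correlation of true sens and spec: {between_study_corr:.2f}")
```

Because a lower threshold raises the true sensitivity and lowers the true specificity, the correlation across the simulated studies is negative, even though no within-study correlation has been introduced.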

We assume that a ‘two-stage’ approach to analysis is adopted. At the first stage, (typically standard) analyses of each trial are performed, and estimates of parameters of interest are obtained; for example, in a survival study, the estimated hazard ratios of overall and disease-free survival. The within-study covariance matrices are also obtained at this stage, containing the variance of each effect and their covariances. These estimates are then combined at the second stage. If the estimates are obtained from published papers, as is typically the case, then a two-stage approach is necessary, but if individual patient data (IPD) are available, a one-stage approach is possible and may be preferable. One-stage methods for IPD random effects meta-analyses have been suggested for continuous [6], binary [7], ordinal [8] and time-to-event data [9]. When the within-study model is relatively computationally complex, as is the case in survival modelling for example, and the data set is large, one-stage meta-analysis methods become computationally infeasible [10] and a two-stage approach becomes necessary.
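To make the second stage concrete, the following is a minimal sketch of how first-stage estimates of two effects might be combined. It is not the estimation procedure of this article: it assumes every study reports both effects, treats the between-study covariance matrix as known, and uses inverse-covariance weighting. The function names and all numerical values are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def add2(m, n):
    """Element-wise sum of two 2x2 matrices."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]


def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def pool_bivariate(estimates, within_covs, between_cov):
    """Second stage: weighted average with weights (S_i + Sigma)^{-1}."""
    sum_w = [[0.0, 0.0], [0.0, 0.0]]
    sum_wy = [0.0, 0.0]
    for y, s in zip(estimates, within_covs):
        w = inv2(add2(s, between_cov))  # weight matrix for this study
        sum_w = add2(sum_w, w)
        wy = matvec2(w, y)
        sum_wy = [sum_wy[0] + wy[0], sum_wy[1] + wy[1]]
    pooled_cov = inv2(sum_w)            # covariance of the pooled estimate
    pooled = matvec2(pooled_cov, sum_wy)
    return pooled, pooled_cov


# Hypothetical first-stage output: log hazard ratios for overall and
# disease-free survival from two studies, with their within-study covariances.
estimates = [[-0.2, -0.3], [-0.4, -0.1]]
within_covs = [[[0.04, 0.02], [0.02, 0.05]],
               [[0.04, 0.02], [0.02, 0.05]]]
between_cov = [[0.0, 0.0], [0.0, 0.0]]  # assumed known; zero for simplicity

pooled, pooled_cov = pool_bivariate(estimates, within_covs, between_cov)
print(pooled, pooled_cov)
```

With identical within-study covariance matrices and no between-study variation, as in this toy input, the pooled estimate reduces to the simple average of the two studies' estimates. In practice the between-study covariance must itself be estimated, and studies with missing effects need separate handling.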

Considerable progress has recently been made in the development of multivariate meta-analysis, and a tutorial paper [11] on multivariate meta-analysis and meta-regression appeared in *Statistics in Medicine* less than a decade after Eysenck's article. This tutorial mainly focussed on the bivariate case where the outcome pairs are arm-specific measures. Hence, conditional on the study-specific true underlying measures, all effects are assumed to be independent. Although this special case is useful in some settings, applications have been found where this assumption is clearly implausible. More recently, investigations have examined the effect of misspecifying the within-study correlations [12, 13]. In order to perform multivariate meta-analyses more generally, purpose-built software has been written to fit the multivariate random effects model [14] so that this can now be used routinely in conjunction with a variety of estimation methods. Hence, the weaponry is now firmly in place: all that has to be decided now is if, when and how to wield it. Multivariate meta-analysis has an abundance of potential and promise over its univariate counterpart. In particular, it can describe the associations between the estimates of effect in order to help make predictions about the true effects of a new study and provide estimates with better statistical properties, due to the borrowing of strength that it enables.

In order to raise awareness of the recent methodological developments, and the applications that motivated them, the authors of this article organized a one-day ‘Multivariate Meta-Analysis’ event on 26th January 2010 at the Royal Statistical Society (RSS). The authors initially presented the theory, and the applications followed. (Diagnostic tests: Roger Harbord, Theo Stijnen. Multiple parameter models: Stephen Kaptoge, Ben Armstrong and Antonio Gasparrini, Dan Jackson. Selective outcome reporting: Paula Williamson.) This meeting resulted in considerable enthusiasm and encouragement, but concerns and issues were also raised, and we felt it timely to provide a balanced account of the discourse of the meeting. Riley [13] notes that, with the exception of diagnostic test studies, ‘multivariate meta-analysis methods are rarely used by practitioners in systematic reviews’. Hence, if the concerns outweigh the benefits, it may not be too late to stifle multivariate meta-analysis in the way that, as recently as 1997, Egger and Smith [2] suggested some may think meta-analysis *per se* should have been.

In this article we proceed as follows. In Section 2 we describe the areas of application that motivated multivariate methods. In Section 3 we discuss the multivariate random effects model and its assumptions. In Section 4 we describe the estimation methods that have been developed. In Section 5 we apply the methods to our example data sets and discuss the advantages and limitations of the multivariate methods in relation to these. In Section 6 we tackle perhaps the greatest practical difficulty: handling the (frequently unknown) within-study correlations. We conclude our article with a discussion, which is followed by invited commentaries from some of those present at the RSS meeting and others with an interest in meta-analysis.