## Introduction

A meta-analysis involves taking data from more than one study and analyzing it to derive a pooled estimate, commonly referred to as a summary estimate. Meta-analyses vary considerably from very simple to very complex, and a variety of software packages is available to aid the process. As with any good research, the most important thing is asking the right question. In the context of meta-analyses, my perspective is very broad in that I believe it is worthwhile to pool as much data as possible; i.e., for the most part, all apples are oranges and vice versa, and the extensive data set can then be used to try to explain any differences in outcomes. The philosophy behind this approach is that if we expect a drug class to produce a largely similar direction of effect across a broad range of patients, pooling data from studies of different drugs and different dosages is reasonable, and should facilitate subsequent subgroup analyses; it is most important, however, to define the questions to be addressed a priori.

### Consistency of Effects

The consistency of the direction of effect across trials is the first parameter to consider when deciding whether to pool data. Ideally, studies examining interventions for a particular indication would all demonstrate a similar estimate of the size of the effect, and meta-analysis would further corroborate this. In reality this rarely happens, so a visual assessment of heterogeneity should be undertaken to decide whether meta-analysis is likely to be useful, and to determine which data to pool. To put it very simply, a meta-analysis of four studies, two of which exhibit a beneficial effect and two of which exhibit a harmful effect, is not going to be informative without some understanding of why these studies have such disparate results. In the "real world," some studies generally give positive results while others give negative results, but the overall direction of the effect tends to be relatively consistent (Fig. 1), and upon pooling the data, additional analyses based on prior knowledge can be pre-planned to help explore and explain the discrepancies.
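As a minimal sketch, this direction-of-effect check can be done before any formal pooling. The relative risks below are hypothetical values chosen for illustration, not results from any real trials:

```python
import math

# Hypothetical relative risks from four trials (illustrative values only).
relative_risks = [0.82, 0.91, 0.77, 1.03]

# Work on the log scale, where RR < 1 (benefit) gives a negative value
# and RR > 1 (harm) gives a positive value.
log_rrs = [math.log(rr) for rr in relative_risks]

# Count how many studies point in each direction.
n_benefit = sum(1 for x in log_rrs if x < 0)
n_harm = sum(1 for x in log_rrs if x > 0)

print(f"{n_benefit} of {len(log_rrs)} studies suggest benefit, {n_harm} suggest harm")
```

If the split were two and two, as in the example in the text, that would argue for investigating the source of the disagreement before pooling rather than proceeding directly to a summary estimate.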

### Assumptions Prior to Pooling Data

A very narrow or very broad approach can be taken to pooling data. In the latter approach, data may be pooled across a variety of different interventions and a variety of patient populations, which allows for extensive exploration of the data that may help to identify causes of variability in outcomes, and facilitates treatment comparisons. Causes of variability in outcomes across studies can be assessed in a variety of ways. For example, sensitivity analysis can be used to determine the impact of predefined parameters on the results of the meta-analysis. Taking statins for the secondary prevention of cardiovascular events as an example, studies primarily involving diabetic patients may be excluded from the pooled dataset because they are a particularly high-risk group expected to have worse outcomes than the general study population. Data pooling may also be influenced by the logistics of predefined subgroup analyses of specific patient populations, interventions or outcomes.

Various tests can be applied to assess data heterogeneity and to gauge the consistency of subgroup analyses. The most common of these is the I-squared statistic, a measure of the proportion of total variation across study estimates that is due to heterogeneity rather than chance. It ranges between 0% and 100%, with lower values representing less heterogeneity. Meta-regression, a more sophisticated method for investigating heterogeneity of effects across studies, examines the relationship between one or more study parameters and the effect sizes observed in the studies; again, software is available to perform this function.
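The I-squared statistic is conventionally derived from Cochran's Q, a weighted sum of squared deviations of each study's effect from the pooled estimate. A minimal sketch, using hypothetical log relative risks and within-study variances rather than any real dataset:

```python
# Hypothetical log relative risks and within-study variances for five studies.
effects = [-0.40, -0.05, -0.60, 0.10, -0.20]
variances = [0.01, 0.02, 0.05, 0.03, 0.01]

# Inverse-variance weights and the fixed-effects pooled estimate.
weights = [1.0 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted sum of squared deviations from the pooled estimate.
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1  # under homogeneity, Q follows chi-squared on k-1 df

# I-squared: share of variation beyond what chance alone would predict,
# floored at 0% when Q falls below its degrees of freedom.
i_squared = max(0.0, (q - df) / q) * 100.0

print(f"Q = {q:.2f} on {df} df, I-squared = {i_squared:.1f}%")
```

With these illustrative numbers, roughly two-thirds of the observed variation is attributed to genuine between-study heterogeneity, which would usually prompt the kind of predefined subgroup or sensitivity analyses described above.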

### Comprehensive Meta-Analysis—Statins

Two models commonly used in meta-analysis are the fixed-effects and random-effects models. The former assumes there is a single true effect underlying all study results, with differences between studies attributable to chance. The latter assumes that the true effect varies from study to study; by allowing for this between-study variability, it places relatively greater weight on smaller studies. In our hands, meta-analysis of data from 41 studies of the efficacy of statins in cardiovascular disease, involving more than 41,000 participants, led to an estimated relative risk for all-cause mortality of 0.85 (confidence interval (CI), 0.81 to 0.90) with a fixed-effects model, and 0.83 (CI, 0.78 to 0.90) with the more conservative random-effects model. In general, because it is more conservative, the random-effects model is favored by statisticians; however, there are instances when the fixed-effects approach is more reasonable. For example, some of the statin trials included in the analysis had 10,000 participants, while others had fewer than 100. In this case, it could be assumed that the larger trials are of better quality than the smaller trials, so a fixed-effects model may be more appropriate. Statins have been extensively studied in clinical trials, and it is generally accepted that they are an effective drug class for the prevention of cardiovascular events in patients at increased risk of cardiovascular disease, or those with established cardiovascular disease. Consistent with this, both of our analytical models confirmed that statins are indeed an effective intervention that significantly reduces all-cause mortality. Notably, however, upon examining the data from the individual trials, it appears that in most cases, statins do not have a statistically significant effect on this endpoint (Fig. 2). So, although many of these trials included thousands of participants, the relatively low number of events meant that many were underpowered to demonstrate an intervention effect of this nature.
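The contrast between the two models can be sketched with inverse-variance pooling for the fixed-effects estimate and the DerSimonian-Laird method, one common choice, for the random-effects estimate. The data below are hypothetical log relative risks, not the statin dataset discussed above:

```python
import math

# Hypothetical log relative risks and within-study variances for five trials;
# small variances stand in for large trials, large variances for small ones.
effects = [-0.40, -0.05, -0.60, 0.10, -0.20]
variances = [0.01, 0.02, 0.05, 0.03, 0.01]

# Fixed-effects: inverse-variance weights, assuming one true underlying effect.
w_fixed = [1.0 / v for v in variances]
fixed = sum(w * e for w, e in zip(w_fixed, effects)) / sum(w_fixed)

# DerSimonian-Laird estimate of the between-study variance (tau-squared),
# derived from Cochran's Q.
q = sum(w * (e - fixed) ** 2 for w, e in zip(w_fixed, effects))
df = len(effects) - 1
c = sum(w_fixed) - sum(w ** 2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)

# Random-effects: adding tau-squared to each study's variance flattens the
# weights, which is why smaller studies count for relatively more.
w_random = [1.0 / (v + tau2) for v in variances]
random_eff = sum(w * e for w, e in zip(w_random, effects)) / sum(w_random)

print(f"fixed-effects RR = {math.exp(fixed):.2f}, "
      f"random-effects RR = {math.exp(random_eff):.2f}")
```

Note how the two pooled estimates differ only modestly here, much as the fixed-effects (0.85) and random-effects (0.83) statin results differ modestly in the analysis described above; the random-effects confidence interval, however, would be wider.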