Meta-analyses are often used to synthesize the findings of studies examining the correlational relationship between two continuous variables. When only dichotomous measurements are available for one of the two variables, the biserial correlation coefficient can be used to estimate the product–moment correlation between the two underlying continuous variables. Unlike the point-biserial correlation coefficient, biserial correlation coefficients can therefore be integrated with product–moment correlation coefficients in the same meta-analysis. The present article describes the estimation of the biserial correlation coefficient for meta-analytic purposes and reports simulation results comparing different methods for estimating the coefficient's sampling variance. The findings indicate that commonly employed methods yield inconsistent estimates of the sampling variance across a broad range of research situations. In contrast, consistent estimates can be obtained using two methods that appear to be unknown in the meta-analytic literature. A variance-stabilizing transformation for the biserial correlation coefficient is described that allows for the construction of confidence intervals for individual coefficients with close to nominal coverage probabilities in most of the examined conditions. Copyright © 2016 John Wiley & Sons, Ltd.
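
As a hedged illustration of the coefficient discussed above, the sketch below computes a sample biserial correlation from raw data using the textbook formula r_b = (ȳ1 − ȳ0) · p · q / (φ(z_p) · s_y), where φ is the standard normal density evaluated at the normal quantile z_p. The function name `biserial`, the 0/1 coding of the dichotomized variable, and the use of the n-denominator standard deviation are assumptions made for this example, not details taken from the article. The sketch does not implement any of the sampling-variance estimators the article compares.

```python
from statistics import NormalDist, mean, pstdev


def biserial(y, g):
    """Biserial correlation between a continuous y and a 0/1 indicator g.

    Assumes g is a dichotomized version of a normally distributed
    variable and that both groups are non-empty (0 < p < 1).
    """
    y1 = [v for v, k in zip(y, g) if k == 1]
    y0 = [v for v, k in zip(y, g) if k == 0]
    p = len(y1) / len(y)          # proportion in the "1" group
    q = 1.0 - p                   # proportion in the "0" group
    z = NormalDist().inv_cdf(p)   # normal quantile at the observed split
    ordinate = NormalDist().pdf(z)  # density of N(0, 1) at that quantile
    # Assumption: s_y is the standard deviation with denominator n.
    return (mean(y1) - mean(y0)) * p * q / (ordinate * pstdev(y))


# Hypothetical data: eight scores, the last four in the "1" group.
scores = [2, 4, 3, 5, 4, 6, 5, 7]
group = [0, 0, 0, 0, 1, 1, 1, 1]
r_b = biserial(scores, group)
```

Under these assumptions the biserial coefficient equals the point-biserial coefficient multiplied by sqrt(p · q) / φ(z_p), so it is always at least as large in absolute value and can exceed 1 in small or strongly separated samples.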

Although well developed to assess efficacy questions, meta-analyses and, more generally, systematic reviews, have received less attention in application to safety-related questions. As a result, many open questions remain on how best to apply meta-analyses in the safety setting. This appraisal attempts to: (i) summarize the current guidelines for assessing individual studies, systematic reviews, and network meta-analyses; (ii) describe several publications on safety meta-analytic approaches; and (iii) present some of the questions and issues that arise with safety data. A number of gaps in the current quality guidelines are identified along with issues to consider when performing a safety meta-analysis. While some work is ongoing to provide guidance to improve the quality of safety meta-analyses, this review emphasizes the critical need for better reporting and increased transparency regarding safety data in the systematic review guidelines. Copyright © 2016 John Wiley & Sons, Ltd.

We describe a meta-analytic scatterplot that indicates precision of points for two variables paired within studies; this is equivalent in form to a ‘cross-hairs’ plot used to portray specificity and sensitivity in diagnostic testing. At the user's discretion, the plot also displays boxplots for each of the X and Y variable distributions, means for each of the variables, and the correlation between the two. The cross-hairs may be suppressed for dense point clouds. The program is written in R, so it can be modified by the user and can serve as a companion to existing meta-analysis programs. Some of the program's novel uses are described and illustrated with (1) independent effect sizes, (2) dependent effect sizes, and (3) shrunken estimates. Copyright © 2016 John Wiley & Sons, Ltd.

Network meta-analysis is becoming a common approach to combine direct and indirect comparisons of several treatment arms. In recent research, there have been various developments and extensions of the standard methodology. Simultaneously, cluster randomized trials are experiencing an increased popularity, especially in the field of health services research, where, for example, medical practices are the units of randomization but the outcome is measured at the patient level. Combination of the results of cluster randomized trials is challenging. In this tutorial, we examine and compare different approaches for the incorporation of cluster randomized trials in a (network) meta-analysis. Furthermore, we provide practical insight on the implementation of the models. In simulation studies, it is shown that some of the examined approaches lead to unsatisfying results. However, there are alternatives which are suitable to combine cluster randomized trials in a network meta-analysis as they are unbiased and reach accurate coverage rates. In conclusion, the methodology can be extended in such a way that an adequate inclusion of the results obtained in cluster randomized trials becomes feasible. Copyright © 2016 John Wiley & Sons, Ltd.

Meta-analyses in orphan diseases and small populations generally face particular problems, including small numbers of studies, small study sizes and heterogeneity of results. However, the heterogeneity is difficult to estimate if only very few studies are included. Motivated by a systematic review in immunosuppression following liver transplantation in children, we investigate the properties of a range of commonly used frequentist and Bayesian procedures in simulation studies. Furthermore, the consequences for interval estimation of the common treatment effect in random-effects meta-analysis are assessed. The Bayesian credibility intervals using weakly informative priors for the between-trial heterogeneity exhibited coverage probabilities in excess of the nominal level for a range of scenarios considered. However, they tended to be shorter than those obtained by the Knapp–Hartung method, which were also conservative. In contrast, methods based on normal quantiles exhibited coverages well below the nominal levels in many scenarios. With very few studies, the performance of the Bayesian credibility intervals is of course sensitive to the specification of the prior for the between-trial heterogeneity. In conclusion, the use of weakly informative priors as exemplified by half-normal priors (with a scale of 0.5 or 1.0) for log odds ratios is recommended for applications in rare diseases. © 2016 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.

Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of ‘mixed-effects’ or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, and why it is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the ‘true’ regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.

The rapid review is an approach to synthesizing research evidence when a shorter timeframe is required. The implications of what is lost in terms of rigour, increased bias and accuracy when conducting a rapid review have not yet been elucidated.

We assessed the potential implications of methodological shortcuts on the outcomes of three completed systematic reviews addressing agri-food public health topics. For each review, shortcuts were applied individually to assess the impact on the number of relevant studies included and whether omitted studies affected the direction, magnitude or precision of summary estimates from meta-analyses.

In most instances, the shortcuts resulted in at least one relevant study being omitted from the review. The omission of studies affected 39 of 143 possible meta-analyses, of which 14 were no longer possible because of insufficient studies (<2). When meta-analysis was possible, the omission of studies generally resulted in less precise pooled estimates (i.e. wider confidence intervals) that did not differ in direction from the original estimate.

The three case studies demonstrated the risk of missing relevant literature and its impact on summary estimates when methodological shortcuts are applied in rapid reviews. © 2016 The Authors. *Research Synthesis Methods* Published by John Wiley & Sons Ltd.

Phase I trials aim to establish appropriate clinical and statistical parameters to guide future clinical trials. With individual trials typically underpowered, systematic reviews and meta-analysis are desired to assess the totality of evidence. A high percentage of zero or missing outcomes often complicates such efforts. We use a systematic review of pediatric phase I oncology trials as an example and illustrate the utility of advanced Bayesian analysis. Standard random-effects methods rely on the exchangeability of individual trial effects, typically assuming that a common normal distribution sufficiently describes random variation among the trial level effects. Summary statistics of individual trial data may become undefined with zero counts, and this assumption may not be readily examined. We conduct Bayesian semi-parametric analysis with a Dirichlet process prior and examine the assumption. The Bayesian semi-parametric analysis is also useful for visually summarizing individual trial data. It provides alternative statistics that are computed free of distributional assumptions about the shape of the population of trial level effects. Outcomes are rarely entirely missing in clinical trials. We utilize available information and conduct Bayesian incomplete data analysis. The advanced Bayesian analyses, although illustrated with the specific example, are generally applicable. © 2016 The Authors. Research Synthesis Methods Published by John Wiley & Sons Ltd.

In meta-analysis, the random-effects model is often used to account for heterogeneity. The model assumes that heterogeneity has an additive effect on the variance of effect sizes. An alternative model, which assumes multiplicative heterogeneity, has been little used in the medical statistics community, but is widely used by particle physicists. In this paper, we compare the two models using a random sample of 448 meta-analyses drawn from the Cochrane Database of Systematic Reviews. In general, differences in goodness of fit are modest. The multiplicative model tends to give results that are closer to the null, with a narrower confidence interval. The two approaches make different assumptions about the outcome of the meta-analysis. In our opinion, the selection of the more appropriate model will often be guided by whether the multiplicative model's assumption of a single effect size is plausible. Copyright © 2016 John Wiley & Sons, Ltd.
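
To make the contrast between the two models concrete, the following sketch pools a set of effect sizes under both. It is an illustration under stated assumptions rather than the procedure used in the paper: the additive model is fitted with the DerSimonian and Laird moment estimator of τ², the multiplicative model scales the fixed-effect variance by φ = Q / (k − 1) without truncating φ at 1, and the function name `pooled_estimates` and the example data are hypothetical.

```python
def pooled_estimates(y, v):
    """Pool effect sizes y with within-study variances v under two models.

    Additive:        Var(y_i) = v_i + tau2  (tau2 by DerSimonian-Laird)
    Multiplicative:  Var(y_i) = phi * v_i   (phi = Q / (k - 1), untruncated)
    Requires at least two studies.
    """
    k = len(y)
    w = [1.0 / vi for vi in v]                    # inverse-variance weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))  # Cochran's Q

    # Multiplicative model: same point estimate as the fixed-effect
    # analysis, with its variance inflated (or deflated) by phi.
    phi = q / (k - 1)
    var_mult = phi / sw

    # Additive model: moment estimate of tau2, then re-weight.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    var_re = 1.0 / sum(w_re)

    return {
        "multiplicative": (mu_fe, var_mult),
        "additive": (mu_re, var_re),
        "phi": phi,
        "tau2": tau2,
    }


# Hypothetical example: three studies with equal within-study variances.
result = pooled_estimates([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

When all within-study variances are equal, as in this example, the two models give the same pooled estimate and variance; they diverge once study precisions differ, because the additive model shifts weight towards smaller studies while the multiplicative model keeps the fixed-effect weights.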

Random-effects meta-analysis methods include an estimate of between-study heterogeneity variance. We present a systematic review of simulation studies comparing the performance of different estimation methods for this parameter. We summarise the performance of methods in relation to estimation of heterogeneity and of the overall effect estimate, and of confidence intervals for the latter. Among the twelve included simulation studies, the DerSimonian and Laird method was most commonly evaluated. This estimate is negatively biased when heterogeneity is moderate to high and therefore most studies recommended alternatives. The Paule–Mandel method was recommended by three studies: it is simple to implement, is less biased than DerSimonian and Laird and performs well in meta-analyses with dichotomous and continuous outcomes. In many of the included simulation studies, results were based on data that do not represent meta-analyses observed in practice, and only small selections of methods were compared. Furthermore, potential conflicts of interest were present when authors of novel methods interpreted their results. On the basis of current evidence, we provisionally recommend the Paule–Mandel method for estimating the heterogeneity variance, and using this estimate to calculate the mean effect and its 95% confidence interval. However, further simulation studies are required to draw firm conclusions. Copyright © 2016 John Wiley & Sons, Ltd.
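
The Paule–Mandel estimator mentioned above is usually described as the value of τ² at which the generalized Q statistic equals its expected value, k − 1. A minimal sketch of that idea follows, assuming a simple bisection search; the function name `paule_mandel`, the tolerance, and the example data are illustrative choices, not taken from the reviewed studies.

```python
def paule_mandel(y, v, tol=1e-10, max_iter=200):
    """Paule-Mandel estimate of the between-study variance tau2.

    Solves Q(tau2) = k - 1, where Q is the weighted sum of squared
    deviations using weights 1 / (v_i + tau2). Q decreases in tau2,
    so a bisection search is sufficient. Returns 0 if Q(0) <= k - 1.
    """
    k = len(y)
    target = k - 1

    def gen_q(tau2):
        w = [1.0 / (vi + tau2) for vi in v]
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))

    if gen_q(0.0) <= target:
        return 0.0            # no excess heterogeneity detected

    lo, hi = 0.0, 1.0
    while gen_q(hi) > target:  # expand until the root is bracketed
        hi *= 2.0

    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if gen_q(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


# Hypothetical example: three studies with equal within-study variances.
tau2_pm = paule_mandel([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

With equal within-study variances the Paule–Mandel and DerSimonian and Laird estimates coincide; differences between the two arise when study precisions vary.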

In a network meta-analysis, comparators of interest are ideally connected either directly or *via* one or more common comparators. However, in some therapeutic areas, the evidence base can produce networks that are disconnected, in which there is neither direct evidence nor an indirect route for comparing certain treatments within the network. Disconnected networks may occur when there is no accepted standard of care, when there has been a major paradigm shift in treatment, when use of a standard of care or placebo is debated, when a product receives orphan drug designation, or when there is a large number of available treatments and many accepted standards of care. These networks pose a challenge to decision makers and clinicians who want to estimate the relative efficacy and safety of newly available agents against alternatives. A currently recommended approach is to insert a distribution for the unknown treatment effect(s) into a network meta-analysis model of treatment effect. In this paper, we describe this approach along with two alternative Bayesian models that can accommodate disconnected networks. Additionally, we present a theoretical framework to guide the choice between modeling approaches. This paper presents researchers with the tools and framework for selecting appropriate models for indirect comparison of treatment efficacies when challenged with a disconnected framework. Copyright © 2016 John Wiley & Sons, Ltd.

In prognostic studies, a summary statistic such as a hazard ratio is often reported between low-expression and high-expression groups of a biomarker with a study-specific cutoff value. Recently, several meta-analyses of prognostic studies have been reported, but these studies simply combined hazard ratios provided by the individual studies, overlooking the fact that the cutoff values are study-specific. We propose a method to summarize hazard ratios with study-specific cutoff values by estimating the hazard ratio for a 1-unit change of the biomarker in the underlying individual-level model. To this end, we introduce a model for a relationship between a reported log-hazard ratio for a 1-unit expected difference in the mean biomarker value between the low-expression and high-expression groups, which approximates the individual-level model, and propose to make an inference of the model by using the method for trend estimation based on grouped exposure data. Our combined estimator provides a valid interpretation if the biomarker distribution is correctly specified. We applied our proposed method to a dataset that examined the association between the biomarker Ki-67 and disease-free survival in breast cancer patients. We conducted simulation studies to examine the performance of our method. Copyright © 2016 John Wiley & Sons, Ltd.

When considering data from many trials, it is likely that some of them present a markedly different intervention effect or exert an undue influence on the summary results. We develop a forward search algorithm for identifying outlying and influential studies in meta-analysis models. The forward search algorithm starts by fitting the hypothesized model to a small subset of likely outlier-free studies and proceeds by adding studies into the set one-by-one that are determined to be closest to the fitted model of the existing set. As each study is added to the set, plots of estimated parameters and measures of fit are monitored to identify outliers by sharp changes in the forward plots. We apply the proposed outlier detection method to two real data sets: a meta-analysis of 26 studies that examines the effect of writing-to-learn interventions on academic achievement adjusting for three possible effect modifiers, and a meta-analysis of 70 studies that compares a fluoride toothpaste treatment to placebo for preventing dental caries in children. A simple simulated example is used to illustrate the steps of the proposed methodology, and a small-scale simulation study is conducted to evaluate the performance of the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.

When meta-analysing intervention effects calculated from continuous outcomes, meta-analysts often encounter few trials, with potentially a small number of participants, and a variety of trial analytical methods. It is important to know how these factors affect the performance of inverse-variance fixed and DerSimonian and Laird random effects meta-analytical methods. We examined this performance using a simulation study.

Meta-analysing estimates of intervention effect from final values, change scores, ANCOVA or a random mix of the three yielded unbiased estimates of pooled intervention effect. The impact of trial analytical method on the meta-analytic performance measures was important when there was no or little heterogeneity, but was of little relevance as heterogeneity increased. On the basis of larger than nominal type I error rates and poor coverage, the inverse-variance fixed effect method should not be used when there are few small trials.

When there are few small trials, random effects meta-analysis is preferable to fixed effect meta-analysis. Meta-analytic estimates need to be cautiously interpreted; type I error rates will be larger than nominal, and confidence intervals will be too narrow. Use of trial analytical methods that are more efficient in these circumstances may have the unintended consequence of further exacerbating these issues. © 2015 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.

Goodness of fit evaluation should be a natural step in assessing and reporting dose–response meta-analyses from aggregated data of binary outcomes. However, little attention has been given to this topic in the epidemiological literature, and goodness of fit is rarely, if ever, assessed in practice. We briefly review the two-stage and one-stage methods used to carry out dose–response meta-analyses. We then illustrate and discuss three tools specifically aimed at testing, quantifying, and graphically evaluating the goodness of fit of dose–response meta-analyses. These tools are the deviance, the coefficient of determination, and the decorrelated residuals-versus-exposure plot. Data from two published meta-analyses are used to show how these three tools can improve the practice of quantitative synthesis of aggregated dose–response data. In fact, evaluating the degree of agreement between model predictions and empirical data can help the identification of dose–response patterns, the investigation of sources of heterogeneity, and the assessment of whether the pooled dose–response relation adequately summarizes the published results. © 2015 The Authors. *Research Synthesis Methods* published by John Wiley & Sons, Ltd.

This paper investigates how inconsistency (as measured by the *I*² statistic) among studies in a meta-analysis may differ, according to the type of outcome data and effect measure. We used hierarchical models to analyse data from 3873 binary, 5132 continuous and 880 mixed outcome meta-analyses within the Cochrane Database of Systematic Reviews. Predictive distributions for inconsistency expected in future meta-analyses were obtained, which can inform priors for between-study variance. Inconsistency estimates were highest on average for binary outcome meta-analyses of risk differences and continuous outcome meta-analyses. For a planned binary outcome meta-analysis in a general research setting, the predictive distribution for inconsistency among log odds ratios had median 22% and 95% CI: 12% to 39%. For a continuous outcome meta-analysis, the predictive distribution for inconsistency among standardized mean differences had median 40% and 95% CI: 15% to 73%. Levels of inconsistency were similar for binary data measured by log odds ratios and log relative risks. Fitted distributions for inconsistency expected in continuous outcome meta-analyses using mean differences were almost identical to those using standardized mean differences. The empirical evidence on inconsistency gives guidance on which outcome measures are most likely to be consistent in particular circumstances and facilitates Bayesian meta-analysis with an informative prior for heterogeneity. © 2015 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.

When conducting research synthesis, the studies to be combined often do not measure the same set of variables, which creates missing data. When the studies to combine are longitudinal, missing data can occur at the observation level (time-varying) or the subject level (non-time-varying). Traditionally, the focus of missing data methods for longitudinal data has been on missing observation-level variables. In this paper, we focus on missing subject-level variables and compare two multiple imputation approaches: a joint modeling approach and a sequential conditional modeling approach. We find the joint modeling approach to be preferable to the sequential conditional approach, except when the covariance structure of the repeated outcome for each individual has homogeneous variance and exchangeable correlation. Specifically, the regression coefficient estimates from an analysis incorporating imputed values based on the sequential conditional method are attenuated and less efficient than those from the joint method. Remarkably, the estimates from the sequential conditional method are often less efficient than a complete case analysis, which, in the context of research synthesis, implies that we lose efficiency by combining studies. Copyright © 2015 John Wiley & Sons, Ltd.

No abstract is available for this article.

Pairwise meta-analysis is an established statistical tool for synthesizing evidence from multiple trials, but it is informative only about the relative efficacy of two specific interventions. The usefulness of pairwise meta-analysis is thus limited in real-life medical practice, where many competing interventions may be available for a certain condition and studies informing some of the pairwise comparisons may be lacking. This commonly encountered scenario has led to the development of network meta-analysis (NMA). In the last decade, several applications, methodological developments, and empirical studies in NMA have been published, and the area is thriving as its relevance to public health is increasingly recognized. This article presents a review of the relevant literature on NMA methodology aiming to pinpoint the developments that have appeared in the field. Copyright © 2016 John Wiley & Sons, Ltd.

The performance of a drug in a clinical trial setting often does not reflect its effect in daily clinical practice. In this third of three reviews, we examine the applications that have been used in the literature to predict real-world effectiveness from randomized controlled trial efficacy data. We searched MEDLINE and EMBASE from inception to March 2014, the Cochrane Methodology Register, websites of key journals and organisations, and reference lists. We extracted data on the type of model and predictions, data sources, validation and sensitivity analyses, disease area and software. We identified 12 articles in which four approaches were used: multi-state models, discrete event simulation models, physiology-based models and survival and generalized linear models. Studies predicted outcomes over longer time periods in different patient populations, including patients with lower levels of adherence or persistence to treatment, or examined doses not tested in trials. Eight studies included individual patient data. Seven examined cardiovascular and metabolic diseases and three neurological conditions. Most studies included sensitivity analyses, but external validation was performed in only three studies. We conclude that mathematical modelling to predict real-world effectiveness of drug interventions is not widely used at present and not well validated. © 2016 The Authors Research Synthesis Methods Published by John Wiley & Sons Ltd.

The GetReal consortium (“incorporating real-life data into drug development”) addresses the efficacy–effectiveness gap that opens between the data from well-controlled randomized trials in selected patient groups submitted to regulators and the real-world evidence on effectiveness and safety of drugs required by decision makers. Workpackage 4 of GetReal develops evidence synthesis and modelling approaches to generate the real-world evidence. In this commentary, we discuss how questions change when moving from the well-controlled randomized trial setting to real-life medical practice, the evidence required to answer these questions, the populations to which estimates will be applicable and the methods and data sources used to produce these estimates. We then introduce the methodological reviews written by GetReal authors and published in *Research Synthesis Methods* on network meta-analysis, individual patient data meta-analysis and mathematical modelling to predict drug effectiveness. The critical reviews of key methods are a good starting point for the ambitious programme of work GetReal has embarked on. The different strands of work under way in GetReal have great potential to contribute to making clinical trials research as relevant as it can be to patients, caregivers and policy makers. Copyright © 2016 John Wiley & Sons, Ltd.

Meta-analysis of a survival endpoint is typically based on the pooling of hazard ratios (HRs). If competing risks occur, the HRs may lose translation into changes of survival probability. The cumulative incidence functions (CIFs), the expected proportion of cause-specific events over time, re-connect the cause-specific hazards (CSHs) to the probability of each event type. We use CIF ratios to measure treatment effect on each event type. To retrieve information on aggregated, typically poorly reported, competing risks data, we assume constant CSHs. Next, we develop methods to pool CIF ratios across studies. The procedure computes pooled HRs alongside and checks the influence of follow-up time on the analysis. We apply the method to a medical example, showing that follow-up duration is relevant both for pooled cause-specific HRs and CIF ratios. Moreover, if all-cause hazard and follow-up time are large enough, CIF ratios may reveal additional information about the effect of treatment on the cumulative probability of each event type. Finally, to improve the usefulness of such analysis, better reporting of competing risks data is needed. Copyright © 2015 John Wiley & Sons, Ltd.
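
Under the constant cause-specific hazard assumption stated above, the cumulative incidence function for event type k has the closed form CIF_k(t) = (λ_k / λ) · (1 − exp(−λ t)), where λ is the sum of all cause-specific hazards. The sketch below evaluates this expression and the resulting CIF ratio between two arms. The function names `cif` and `cif_ratio` and the hazard values are hypothetical, and the pooling across studies described in the abstract is not implemented.

```python
import math


def cif(hazards, k, t):
    """Cumulative incidence of event type k at time t.

    Assumes constant cause-specific hazards; `hazards` lists one
    hazard rate per competing event type.
    """
    total = sum(hazards)  # all-cause hazard
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return hazards[k] / total * -math.expm1(-total * t)


def cif_ratio(hazards_treat, hazards_ctrl, k, t):
    """Ratio of cumulative incidences (treatment over control) at time t."""
    return cif(hazards_treat, k, t) / cif(hazards_ctrl, k, t)


# Hypothetical hazards per unit time for two competing event types.
treat = [0.1, 0.2]
ctrl = [0.2, 0.2]

early = cif_ratio(treat, ctrl, 0, 0.01)   # short follow-up
late = cif_ratio(treat, ctrl, 0, 50.0)    # long follow-up
```

For short follow-up the CIF ratio is close to the cause-specific hazard ratio (0.5 in this example), while for long follow-up it approaches the ratio of the shares λ_k / λ in each arm, which illustrates why follow-up duration matters for the interpretation.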

Whilst it is common in clinical trials to use the results of tests at one phase to decide whether to continue to the next phase and to subsequently design the next phase, we show that this can lead to biased results in evidence synthesis. Two new kinds of bias associated with accumulating evidence, termed ‘sequential decision bias’ and ‘sequential design bias’, are identified. Both kinds of bias are the result of making decisions on the usefulness of a new study, or its design, based on the previous studies. Sequential decision bias is determined by the correlation between the value of the current estimated effect and the probability of conducting an additional study. Sequential design bias arises from using the estimated value instead of the clinically relevant value of an effect in sample size calculations. We considered both the fixed-effect and the random-effects models of meta-analysis and demonstrated analytically and by simulations that in both settings the problems due to sequential biases are apparent. According to our simulations, the sequential biases increase with increased heterogeneity. Minimisation of sequential biases arises as a new and important research area necessary for successful evidence-based approaches to the development of science. © 2015 The Authors. *Research Synthesis Methods* Published by John Wiley & Sons Ltd.

We present an alternative to the contrast-based parameterization used in a number of publications for network meta-analysis. This alternative “arm-based” parameterization offers a number of advantages: it allows for a “long” normalized data structure that remains constant regardless of the number of comparators; it can be used to directly incorporate individual patient data into the analysis; the incorporation of multi-arm trials is straightforward and avoids the need to generate a multivariate distribution describing treatment effects; there is a direct mapping between the parameterization and the analysis script in languages such as WinBUGS; and finally, the arm-based parameterization allows simple extension to treatment-specific random treatment effect variances.

We validated the parameterization using a published smoking cessation dataset. Network meta-analysis using arm- and contrast-based parameterizations produced comparable results (with means and standard deviations being within +/− 0.01) for both fixed and random effects models. We recommend that analysts consider using arm-based parameterization when carrying out network meta-analyses. © 2015 The Authors *Research Synthesis Methods* Published by John Wiley & Sons Ltd.

An unobserved random effect is often used to describe the between-study variation that is apparent in meta-analysis datasets. A normally distributed random effect is conventionally used for this purpose. When outliers or other unusual estimates are included in the analysis, the use of alternative random effect distributions has previously been proposed. Instead of adopting the usual hierarchical approach to modelling between-study variation, and so directly modelling the study-specific true underlying effects, we propose two new marginal distributions for modelling heterogeneous datasets. These two distributions are suggested because numerical integration is not needed to evaluate the likelihood. This makes the computation required when fitting our models much more robust. The properties of the new distributions are described, and the methodology is exemplified by fitting models to four datasets. © 2015 The Authors. *Research Synthesis Methods* published by John Wiley & Sons, Ltd.

In this note, we clarify and prove the claim made by Higgins *et al*. that the design-by-treatment interaction model contains all possible loop inconsistency models. This claim provides a strong argument for using the design-by-treatment interaction model to describe loop inconsistencies in network meta-analysis. © 2015 The Authors. *Research Synthesis Methods* published by John Wiley & Sons, Ltd.

Individual participant data (IPD) is the backbone of scientific inquiry and important to a meta-analysis for a variety of reasons. It is therefore important to be able to access IPD, and yet, obstacles persist that make it difficult for meta-analysts, as well as interested primary study analysts, to obtain it. In this paper, we discuss the barriers to obtaining IPD via online repositories or contacting primary study authors and provide an example data sharing agreement that can be used to ameliorate a few of these issues. We also discuss the ethics of data sharing. The goal of this paper is to help meta-analysts anticipate these potential barriers at the outset of their studies and hopefully increase the likelihood of producing thorough IPD syntheses and foster collaborative partnerships with primary study researchers. Copyright © 2016 John Wiley & Sons, Ltd.
