A design‐by‐treatment interaction model for network meta‐analysis and meta‐regression with integrated nested Laplace approximations
Abstract
Network meta‐analysis (NMA) is gaining popularity for comparing multiple treatments in a single analysis. Generalized linear mixed models provide a unifying framework for NMA, allow us to analyze datasets with dichotomous, continuous or count endpoints, and take into account multiarm trials, potential heterogeneity between trials and network inconsistency. To perform inference within such NMA models, the use of Bayesian methods is often advocated. The standard inference tool is Markov chain Monte Carlo (MCMC), which is computationally expensive and requires convergence diagnostics. Integrated nested Laplace approximations (INLA) offer a deterministic approach to fully Bayesian inference for latent Gaussian models and are a fast and accurate alternative to MCMC. We show how NMA models fit in the class of latent Gaussian models and how they are implemented using INLA, and we demonstrate that the estimates obtained by INLA are in close agreement with those obtained by MCMC. Specifically, we emphasize the design‐by‐treatment interaction model with random inconsistency parameters (also known as the Jackson model). We also propose a network meta‐regression model, which is constructed by incorporating trial‐level covariates into the Jackson model to explain possible sources of heterogeneity and/or inconsistency in the network. A publicly available R package, nmaINLA, has been developed to automate the INLA implementation of the NMA models considered in this paper. Three applications illustrate the use of INLA for an NMA.
1 INTRODUCTION
Network meta‐analysis (NMA)1 or mixed treatment comparison,2 which is a generalization of the pairwise (2 treatments) meta‐analysis,3 allows us to compare multiple treatments even when they have not been compared directly within a single trial. In recent years, with the increasing number of alternative treatment options, NMA has gained popularity, especially in the medical literature.4 NMA can be regarded as the next generation evidence synthesis tool or the new norm for comparative effectiveness research.5
Many statistical models and parametrizations have been proposed for NMA. The standard approach to NMA is the contrast‐based model, where the information on the relative treatment effects, expressed for example as log odds ratios, is pooled over trials. An alternative approach is the arm‐based (AB) model,6 where absolute effects on each treatment arm, for instance, log odds, are pooled. For further details of these 2 methods, we refer readers to Piepho,7 Dias and Ades,8 and Hong et al.9 In this paper, we exclusively consider contrast‐based modeling approaches, and we return to AB models in the discussion.
A contrast‐based model can be defined using a difference‐based likelihood or an arm‐based likelihood (not to be confused with AB models).8 The difference‐based likelihood approach uses a normal approximation to produce a summary estimate and its variance for each relative treatment effect. Also, if there are no events in at least one of the trial arms for datasets with dichotomous endpoints, a continuity correction is needed. On the other hand, a contrast‐based model with an arm‐based likelihood avoids normal approximations and continuity corrections and uses, for instance, a Binomial likelihood for datasets with dichotomous endpoints. For these reasons, an arm‐based likelihood approach is preferable, and hence, we focus on these types of models in the following.
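To make the role of the continuity correction concrete, the following minimal sketch computes the kind of summary a difference‐based likelihood would use for a 2‐arm trial with dichotomous data: a log odds ratio and its approximate variance. The function name and the correction value of 0.5, added to every cell only when a zero cell occurs, are illustrative assumptions (0.5 is a common convention, not something prescribed in this paper).

```python
import math

def log_odds_ratio(r1, n1, r2, n2, correction=0.5):
    """Log odds ratio of arm 2 versus arm 1 and its approximate variance.

    r1, r2 are numbers of events and n1, n2 numbers of patients.
    `correction` is added to all four cells only if some cell is zero
    (an assumed convention for this sketch).
    """
    cells = [r1, n1 - r1, r2, n2 - r2]
    if any(c == 0 for c in cells):
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    log_or = math.log((c / d) / (a / b))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

# Hypothetical trial with no events in the first arm: the correction is applied.
estimate, variance = log_odds_ratio(0, 10, 5, 10)
```

An arm‐based Binomial likelihood works with the counts directly, so neither the normal approximation nor the correction above is required.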
Two of the most important challenges regarding NMA models are heterogeneity between trials and lack of consistency (inconsistency) in estimated treatment effects. Inconsistency arises when treatment effects obtained from direct and indirect evidence do not agree. NMA models may be divided into 2 categories according to their approach to inconsistency. Firstly, the loop‐inconsistency approach assumes that inconsistency only occurs in closed loops of the network; it is represented by inconsistency random effects2 or node splitting.10 Secondly, the design‐inconsistency approach was introduced to handle the problems of the loop‐inconsistency approach in the presence of multiarm trials. The inconsistency parameters are treated as fixed effects by Higgins et al11 and as random effects by Jackson et al;12 we refer to the model with random inconsistency parameters as the Jackson model. Moreover, to explain possible sources of heterogeneity and inconsistency in the network, a network meta‐regression model (an extension of an NMA model by including study‐level covariates) can be used.13
NMA models can be fitted using frequentist methods. Recently, to fit the Jackson model with a difference‐based likelihood, Jackson et al14 and Jackson et al15 proposed 2 estimation methods, which are extensions of the pairwise random‐effects meta‐analysis introduced by DerSimonian and Laird.16 Furthermore, a likelihood‐based method was introduced by Law et al17 and a Paule‐Mandel estimator was suggested by Jackson et al18 to fit the Jackson model with a difference‐based likelihood. Alternatively, Bayesian inference is often used to fit NMA models. The standard tool for Bayesian inference is Markov chain Monte Carlo (MCMC), which is a simulation‐based technique. The statistical software packages WinBUGS,19 OpenBUGS,20 JAGS,21 and Stan22 are popular MCMC tools. However, MCMC is computationally expensive and requires careful inspection of convergence diagnostics by the user. A deterministic approach to Bayesian inference for latent Gaussian models (LGMs), integrated nested Laplace approximations (INLA), has been proposed by Rue et al23 as a fast and accurate alternative to MCMC. Paul et al24 introduced an INLA implementation of bivariate meta‐analyses of diagnostic test studies. Sauter and Held25 showed that many NMA models are in the class of LGMs and that INLA can be used as an alternative to MCMC for inference in NMA models. They demonstrated how INLA can be applied to an NMA model with difference‐based likelihood,1 with arm‐based likelihood,2 and with the node‐splitting approach.10
The primary contribution of this paper is to introduce the use of INLA for statistical inference within the Jackson model with arm‐based likelihood. Moreover, we propose a network meta‐regression model as an extension of the Jackson model. We use a common regression framework, which allows us to analyze datasets with different types of outcomes, including continuous, dichotomous, and count outcomes, using INLA. Another contribution to the existing literature is the introduction of an R package, nmaINLA (https://CRAN.R-project.org/package=nmaINLA), which was developed to automate the INLA implementation of the NMA models described in the paper and is publicly available from CRAN. We demonstrate that the estimates obtained by INLA are very close to those obtained by MCMC. In Section 2, we describe different NMA models including methods to deal with inconsistency in the network. In Section 3, we discuss Bayesian inference for NMA models using INLA. The INLA implementation of NMA models is demonstrated using 3 applications in Section 4. We close with some conclusions and provide a brief discussion.
Highlights
What is already known: Bayesian inference using Markov chain Monte Carlo (MCMC) is one of the most popular approaches for fitting network meta‐analysis (NMA) models to take into account possible heterogeneity and inconsistency in the network.
What is new: As an alternative to MCMC, integrated nested Laplace approximations (INLA) can be used to make inference for widely used NMA models including the design‐by‐treatment interaction model. INLA is faster than MCMC and, unlike MCMC, does not require checking convergence diagnostics. Furthermore, a new network meta‐regression model is suggested to explain possible sources of heterogeneity and/or inconsistency.
Potential impact for RSM readers outside the authors' field: To make INLA more accessible to NMA practitioners, a publicly available R package, nmaINLA, has been developed for fitting the discussed NMA models using INLA.
2 STATISTICAL MODELS FOR NMA
We use a generalized linear mixed model (GLMM) framework to describe network meta‐analysis models. In Section 2.1, we describe a fixed effect model, then a consistency model in Section 2.2, and continue with the design‐inconsistency in Section 2.3. We propose a novel network meta‐regression model in Section 2.4.
2.1 Fixed effect model
The models described here follow the ones described in Dias et al26 and Dias and Ades.8 To model datasets with different types of endpoints, we describe a common regression framework. The essential idea is that the basic model remains the same, but the likelihood and the link function can change to reflect the nature of the data (continuous, dichotomous, or count) and the sampling process that generated it (eg, Normal, Binomial, or Poisson). As a special case, a pairwise meta‐analysis is just a network meta‐analysis with only 2 treatments included in the network.
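Let $y_{it_k}$ denote the data observed in arm $t_k$ of trial $i$, with likelihood $y_{it_k} \sim f(y \mid \theta_{it_k})$, where $y$ is the observed data and $\theta_{it_k}$ is the treatment effect parameter of interest of arm $t_k$ of study $i$. A link function $G(\cdot)$ is used to transform $\theta_{it_k}$ onto a scale where its effects can be assumed to be additive:
$$G(\theta_{it_k}) = \mu_{it_1} + d_{t_1 t_k}, \qquad d_{t_1 t_1} = 0, \tag{1}$$
where $\mu_{it_1}$ is the absolute treatment effect of the baseline treatment ($t_1$) in trial $i$, and it is treated as a nuisance parameter. Hereafter, the subscript $t_1$ is dropped from $\mu_{it_1}$, since it is redundant. The main interest is in $d_{t_1 t_k}$, which is the relative treatment effect between $t_1$ and $t_k$. If $d_{t_1 t_2}$ and $d_{t_1 t_3}$ are basic parameters in the network, then a functional parameter $d_{t_2 t_3}$ can be calculated using the following equation:
$$d_{t_2 t_3} = d_{t_1 t_3} - d_{t_1 t_2}. \tag{2}$$
Now, we consider the models for datasets with a dichotomous outcome. Assume that the number of events $r_{it_k}$ and the number of patients $n_{it_k}$ are given in treatment arm $t_k$ of trial $i$. Then the likelihood function can be written as $r_{it_k} \sim \mathrm{Bin}(\pi_{it_k}, n_{it_k})$, and a logit link function is used to define the treatment effect parameter $\pi_{it_k}$ in Equation 1.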
Likewise, the NMA models for continuous outcome data can be formulated using a normal likelihood with identity link function.26 When the data available for the NMA are counts, a Poisson likelihood with log link can be used.26 For simplicity, from this point on, we exclusively consider models for dichotomous outcomes.
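As an illustration of the fixed effect model for dichotomous outcomes, the sketch below evaluates the arm‐level event probabilities of one trial under a logit link and derives a functional parameter from basic parameters, assuming treatment 1 is the reference. All function names and numerical values are hypothetical and chosen only for illustration.

```python
import math

def expit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

def functional_contrast(d_basic, tk, tl):
    """Relative effect of tl versus tk from basic parameters versus treatment 1."""
    d = dict(d_basic)
    d[1] = 0.0  # the reference treatment has no effect relative to itself
    return d[tl] - d[tk]

def arm_probabilities(mu_i, d_basic, treatments):
    """Event probabilities for the arms of one trial; treatments[0] is its baseline."""
    t1 = treatments[0]
    return [expit(mu_i + functional_contrast(d_basic, t1, tk)) for tk in treatments]

d_basic = {2: 0.5, 3: 0.8}  # hypothetical log odds ratios versus treatment 1
probs = arm_probabilities(mu_i=-1.0, d_basic=d_basic, treatments=[2, 3])
```

In this example the trial compares treatments 2 and 3, so its relative effect is the functional parameter obtained as the difference of the two basic parameters.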
2.2 Consistency model
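To account for heterogeneity between trials, trial‐specific random effects, $\gamma_{it_1t_k}$'s, are added to Equation 1, and it is assumed that the $\gamma_{it_1t_k}$'s are independently (and normally) distributed with mean zero and some variance (heterogeneity variance). However, treatment comparisons in a multiarm trial are not independent, since all nonbaseline treatments are compared to the same baseline treatment. To illustrate this situation, consider a 3‐arm trial $i$ including treatments $t_1$, $t_2$, and $t_3$. To account for this dependency within trial $i$, a multivariate normal distribution with mean zero vector is used for the random effects vector $(\gamma_{it_1t_2}, \gamma_{it_1t_3})^\top$. A simple but convenient structure of the covariance matrix of the multivariate normal distribution is suggested by Higgins and Whitehead.27 In general, we model heterogeneity as
$$\boldsymbol{\gamma}_i \sim \mathcal{N}(\mathbf{0}, \Sigma_\gamma), \qquad \Sigma_\gamma = \begin{pmatrix} \tau^2 & \tau^2/2 & \cdots & \tau^2/2 \\ \tau^2/2 & \tau^2 & \cdots & \tau^2/2 \\ \vdots & \vdots & \ddots & \vdots \\ \tau^2/2 & \tau^2/2 & \cdots & \tau^2 \end{pmatrix}, \tag{3}$$
$$\operatorname{logit}(\pi_{it_k}) = \mu_i + d_{t_1t_k} + \gamma_{it_1t_k}, \tag{4}$$
where $\boldsymbol{\gamma}_i$ collects the random effects of all nonbaseline arms of trial $i$, $\tau^2$ is the heterogeneity variance, and $d_{t_1t_1} = \gamma_{it_1t_1} = 0$ for the baseline arm. We refer to this model as the consistency model, since it assumes that there is no inconsistency in the network.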
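The covariance structure for multiarm trials can be sketched as follows. The sketch assumes the commonly used form of the Higgins and Whitehead structure, with the heterogeneity variance on the diagonal and half of it off the diagonal (an implied correlation of 0.5 between random effects of the same trial); the function name and the value of τ are hypothetical.

```python
def heterogeneity_covariance(n_arms, tau):
    """Covariance matrix of the random effects of one trial with n_arms arms.

    Assumed structure: tau^2 on the diagonal, tau^2 / 2 off the diagonal.
    The matrix has one row per nonbaseline arm.
    """
    size = n_arms - 1
    return [
        [tau ** 2 if row == col else tau ** 2 / 2 for col in range(size)]
        for row in range(size)
    ]

# A 3-arm trial has two random effects that share the baseline arm.
cov_three_arm = heterogeneity_covariance(n_arms=3, tau=0.4)
```

For a 2‐arm trial the same function returns a 1 × 1 matrix containing only the heterogeneity variance, so 2‐arm and multiarm trials are handled uniformly.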
2.3 Design‐inconsistency model
The consistency assumption may not hold if there is a discrepancy between the evidence coming from direct and indirect estimates. To take into account possible inconsistency in the network, the Lu‐Ades model2 or design‐inconsistency approaches can be used. The Lu‐Ades model adds inconsistency parameters to closed loops where loop inconsistency may arise. However, Higgins et al11 showed that the estimates of the Lu‐Ades model depend on treatment ordering. Also, Jackson et al28 have shown that the Lu‐Ades model is a restricted version of the design‐inconsistency model. Therefore, we exclusively consider the design‐inconsistency model in this paper.
The design‐by‐treatment interaction model for inconsistency was introduced in Higgins et al.11 The central concept of this approach is the design, which refers to the set of treatments included in a particular study. We use D(i) to denote the design of trial i. For example, if the first design compares treatments 1 and 2, then D=1 refers to 2‐arm trials that compare treatments 1 and 2. Design inconsistency means differences in treatment effects between designs. Higgins et al11 treated design‐inconsistency parameters as fixed effects, whereas Jackson et al12 treated them as random effects; hereafter, we use the term Jackson model for the latter. The advantage of treating inconsistency parameters as fixed effects is that, unlike in the Jackson model, no common distribution needs to be assumed for them.11 In the Jackson model, on the other hand, inconsistency is treated as an additional source of variation alongside heterogeneity. The Jackson model also facilitates sensitivity analyses in terms of a single sensitivity parameter (the inconsistency variance), and more importantly, we can “estimate average treatment effects across all designs, rather than the design‐specific treatments effects we obtain when using fixed effects for the inconsistency parameters.”12 In this section, we only discuss the Jackson model.
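The Jackson model extends the consistency model by changing $d_{t_1t_k}$ to $d_{t_1t_k} + \omega^{D(i)}_{t_1t_k}$, where $\omega^{D(i)}_{t_1t_k}$ is a design‐specific inconsistency random effect. Hence, $\omega^{D(i)}_{t_1t_k}$ is added to Equation 4 resulting in
$$\operatorname{logit}(\pi_{it_k}) = \mu_i + d_{t_1t_k} + \gamma_{it_1t_k} + \omega^{D(i)}_{t_1t_k}, \tag{5}$$
$$\boldsymbol{\omega}^{D} \sim \mathcal{N}(\mathbf{0}, \Sigma_\omega), \tag{6}$$
where $\boldsymbol{\omega}^{D}$ collects the inconsistency random effects of design $D$ and $\Sigma_\omega$ has the inconsistency variance $\kappa^2$ on the diagonal and $\kappa^2/2$ off the diagonal. The set of inconsistency random effects describes where particular inconsistencies arise. Note that inconsistency random effects are defined based on the data at hand, and they should be specified after the set of designs, $D(i)$'s, of the network is determined. To illustrate how to choose the inconsistency parameters, consider a simple NMA dataset that includes one 3‐arm trial and two 2‐arm trials. The 3‐arm trial includes treatments 1, 2, and 3; one 2‐arm trial includes treatments 1 and 2, and the other includes treatments 1 and 3. In this example, there are 3 different designs ($D = 1, 2, 3$) and 4 different inconsistency random effects, namely, $\omega^{1}_{12}$, $\omega^{1}_{13}$, $\omega^{2}_{12}$, and $\omega^{3}_{13}$.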
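The bookkeeping in this example (one 3‐arm trial on treatments 1, 2, 3 and two 2‐arm trials on treatments 1, 2 and 1, 3) can be sketched in a few lines. The function names are hypothetical, and the numbering of designs (designs with more arms first, then in lexicographic order) is an assumption made only so that the output is deterministic.

```python
def enumerate_designs(trials):
    """Map each distinct set of treatments to a design number."""
    unique = {tuple(sorted(trial)) for trial in trials}
    # Assumed ordering: more arms first, then lexicographic order.
    ordered = sorted(unique, key=lambda design: (-len(design), design))
    return {design: number for number, design in enumerate(ordered, start=1)}

def inconsistency_effects(designs):
    """One effect per nonbaseline treatment of each design.

    Each effect is returned as (design number, baseline, treatment).
    """
    effects = []
    for design, number in sorted(designs.items(), key=lambda item: item[1]):
        baseline = design[0]
        effects.extend((number, baseline, t) for t in design[1:])
    return effects

trials = [(1, 2, 3), (1, 2), (1, 3)]
designs = enumerate_designs(trials)
effects = inconsistency_effects(designs)
```

With these inputs the sketch finds 3 designs and 4 inconsistency random effects, in line with the counts stated in the example.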
To parametrize the network, any T−1 treatment comparisons can be chosen as basic parameters as in the fixed effect and consistency model. However, for the implementation of the Jackson model (as well as fixed effect and consistency models), as described in Law et al,17 we determine a reference treatment, treatment 1, and then choose relative treatment effects compared to the reference treatment as basic parameters, $d_{1t_k}$'s where $t_k \neq 1$. This is done only to make implementation and interpretation easier, since the Jackson model is invariant to the choice of basic parameters.
2.4 Network meta‐regression
The exploration of covariate‐by‐treatment interactions in an NMA context, or network meta‐regression,13 is used to explain potential sources of heterogeneity in the network. These models can be constructed by extending the consistency or the fixed effect model with study‐level covariates, say xi's. Thus, an NMA model is a network meta‐regression model without covariates. An investigation of covariates to explain inconsistency in the network may also be of interest. To achieve this, relevant covariates can be incorporated into the Jackson model. If the inconsistency variance decreases substantially as a result, we may conclude that the included covariate explains part of the inconsistency in the network.12 We therefore propose a network meta‐regression model based on the Jackson model. Note, however, that even if we only include randomized controlled trials in our analysis, meta‐regression (pairwise or network) inherits the challenges attached to all observational studies, for example, confounding. In other words, we may observe an association between a covariate and a relative treatment effect, but this association need not be causal, because patients cannot be randomized to covariate values (see Thompson and Higgins,29 section 3, for further discussion of the limitations of meta‐regression).
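We assume that the covariate‐by‐treatment interaction coefficient is common to all treatments compared with the reference treatment ($\beta_{1t_k} = \beta$ for any $t_k \neq 1$).30 That means, the treatment effects relative to the reference treatment, $d_{12}, d_{13}, \ldots, d_{1T}$, now become $d_{12} + x_i\beta, d_{13} + x_i\beta, \ldots, d_{1T} + x_i\beta$. In the case of continuous covariates, we use centered covariate values $x_i - \bar{x}$ because this is computationally more stable. To fit the proposed network meta‐regression model, Equation 5 becomes
$$\operatorname{logit}(\pi_{it_k}) = \mu_i + d_{1t_k} + x_i\beta + \gamma_{i1t_k} + \omega^{D(i)}_{1t_k} \quad \text{if } t_1 = 1, \tag{7}$$
$$\operatorname{logit}(\pi_{it_k}) = \mu_i + d_{t_1t_k} + \gamma_{it_1t_k} + \omega^{D(i)}_{t_1t_k} \quad \text{if } t_1 \neq 1, \tag{8}$$
because the interaction term cancels in comparisons between 2 nonreference treatments. This suggests that the choice of the reference treatment is important, and it affects the interpretation of the results. Therefore, we need to determine a reference treatment (say treatment 1), and basic parameters as $d_{1t_k}$'s where $t_k \neq 1$. Otherwise, we do not obtain any meaningful interpretation from the results of the fitted model.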
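The consequence of a common interaction coefficient can be checked numerically. In the sketch below, every treatment effect relative to the reference treatment is shifted by the centered covariate times β, so the covariate term appears in comparisons against the reference treatment and cancels in comparisons between 2 nonreference treatments. The function name, covariate values, and parameter values are hypothetical.

```python
def adjusted_contrast(d_basic, beta, x_centered, t1, tk, reference=1):
    """Relative effect of tk versus t1 for a trial with centered covariate x."""
    def effect_vs_reference(t):
        if t == reference:
            return 0.0
        # Common interaction: every effect versus the reference is shifted.
        return d_basic[t] + beta * x_centered
    return effect_vs_reference(tk) - effect_vs_reference(t1)

mean_ages = [62.0, 70.0, 75.0]                 # hypothetical study-level covariate
overall_mean = sum(mean_ages) / len(mean_ages)
x = mean_ages[1] - overall_mean                # centered value for the second study

d_basic = {2: -1.2, 3: -1.5}                   # hypothetical log odds ratios versus treatment 1
beta = 0.02                                    # hypothetical interaction coefficient

vs_reference = adjusted_contrast(d_basic, beta, x, t1=1, tk=2)
between_active = adjusted_contrast(d_basic, beta, x, t1=2, tk=3)
```

Here `vs_reference` depends on the covariate, whereas `between_active` equals the unadjusted difference of the basic parameters, which is why the interpretation of β is tied to the chosen reference treatment.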
3 BAYESIAN INFERENCE FOR FITTING NETWORK META‐ANALYSIS MODELS USING INLA
- Stage 1: Assume that $N$ is the total number of arms of all trials, $\boldsymbol{\mu}$ is the vector of all baseline treatment effects, $\mathbf{d}_b$ is the vector of the basic parameters, and $\beta$ is the constant interaction coefficient. Also, the random effects vector $\boldsymbol{\gamma}$ contains all trial‐specific heterogeneity random effects. Likewise, $\boldsymbol{\omega}$ contains all inconsistency random effects, $\omega^{D(i)}_{t_1t_k}$, for the Jackson model. The observed data $y = \{y_1, \ldots, y_N\}$ is described by the likelihood
  $$p(y \mid \alpha, \Psi) = \prod_{j=1}^{N} p(y_j \mid \alpha, \Psi), \tag{9}$$
  where $\alpha = (\boldsymbol{\mu}, \mathbf{d}_b, \beta, \boldsymbol{\gamma}, \boldsymbol{\omega})$ includes all model parameters, and hyperparameters are denoted by $\Psi = (\Psi_1 = \tau^2, \Psi_2 = \kappa^2)$.
- Stage 2: It is assumed that random effects $\boldsymbol{\gamma}$ and $\boldsymbol{\omega}$ are both normally distributed (Equations 3 and 6). Also, if we assume normal priors for all elements of $\boldsymbol{\mu}$, $\mathbf{d}_b$, and for $\beta$, then the joint distribution for $\alpha$ is a multivariate normal distribution, $p(\alpha \mid \Psi)$, which is called the latent Gaussian field.
- Stage 3: Lastly, priors are defined for the hyperparameters $\Psi = (\Psi_1 = \tau^2, \Psi_2 = \kappa^2)$. Note that normal as well as nonnormal priors can be selected for the hyperparameters.
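As a compact summary of the three stages, the joint posterior of the latent field and the hyperparameters can be written as a product of one factor per stage. This is the standard factorization of a latent Gaussian model expressed in the notation of this section; it is a restatement for orientation, not an additional result.

```latex
p(\alpha, \Psi \mid y) \;\propto\;
  \underbrace{p(\Psi)}_{\text{Stage 3}}\;
  \underbrace{p(\alpha \mid \Psi)}_{\text{Stage 2}}\;
  \underbrace{\prod_{j=1}^{N} p(y_j \mid \alpha, \Psi)}_{\text{Stage 1}}
```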

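The main interest lies in the posterior marginals of the elements of the latent Gaussian field and of the hyperparameters,
$$p(\alpha_j \mid y) = \int p(\alpha_j \mid \Psi, y)\, p(\Psi \mid y)\, d\Psi, \tag{10}$$
$$p(\Psi_k \mid y) = \int p(\Psi \mid y)\, d\Psi_{-k}. \tag{11}$$
INLA builds nested approximations to $p(\Psi \mid y)$ and $p(\alpha_j \mid \Psi, y)$. For $p(\Psi \mid y)$, we can write
$$\tilde{p}(\Psi \mid y) \propto \left. \frac{p(\alpha, \Psi, y)}{\tilde{p}_G(\alpha \mid \Psi, y)} \right|_{\alpha = \alpha^*(\Psi)},$$
where $\tilde{p}_G(\alpha \mid \Psi, y)$ is the Laplace approximation of $p(\alpha \mid \Psi, y)$ and $\alpha^*(\Psi)$ is the mode for a given $\Psi$. The calculation of $\tilde{p}(\alpha_j \mid \Psi, y)$ is performed using the simplified Laplace approximation, which is based on a Taylor expansion of the Laplace approximation around the mode.34 The posterior marginals in Equation 10 are then approximated by numerical integration,
$$\tilde{p}(\alpha_j \mid y) = \sum_{m} \tilde{p}(\alpha_j \mid \Psi^{(m)}, y)\, \tilde{p}(\Psi^{(m)} \mid y)\, \Delta_m, \tag{12}$$
where $\Psi^{(m)}$ are integration points with weights $\Delta_m$. To find the integration points, first the mode of $\tilde{p}(\Psi \mid y)$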
and the Hessian at the mode are located. Then some relevant points in Ψ‐space (a q‐dimensional space where q is the number of hyperparameters) are selected for performing a second‐order approximation (see Rue et al,23 section 6.5 for details). Therefore, with this strategy, instead of laying out a dense grid of integration points (for example, points with equal weights), only a limited number of well‐chosen points are used. Lastly, marginal posterior densities p(Ψk|y) can be obtained similarly from p(Ψ|y).35
The INLA R package, hereafter referred to as R‐INLA, provides an interface for R to INLA (a free‐standing program) so that models can be fitted using standard R commands. In addition to posterior marginals, R‐INLA also provides estimates of the deviance information criterion (DIC)36 and the Watanabe‐Akaike information criterion.37 The R‐INLA package is available on the INLA website (http://www.r-inla.org/). The use of R‐INLA to fit different NMA models, including the Lu‐Ades model, is explained in Sauter and Held25 and the accompanying Supporting Information. However, for the practitioner carrying out an NMA, the range of options and the required knowledge of available features in R‐INLA might be overwhelming. Fortunately, the data preparation and postprocessing steps can be automated. To this end, we present a new R package, nmaINLA (https://CRAN.R-project.org/package=nmaINLA), which is a purpose‐built package defined on top of R‐INLA and implements all NMA models described in the text. nmaINLA extracts only the features needed for NMA models from R‐INLA and presents them in an intuitive way. Therefore, users do not need to know the structure of the general R‐INLA output object. A tutorial on installing and using the nmaINLA package is given in the Supporting Information. The development version of nmaINLA is available on GitHub (https://github.com/gunhanb/nmaINLA).
For NMA models that are not supported by nmaINLA, one may extend our package or use R‐INLA directly.
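The numerical integration over a small set of hyperparameter points, described earlier in this section, can be illustrated with a toy example. The sketch is not R‐INLA code and does not reproduce its point selection; the integration points, their weights, and the Gaussian conditional density are hypothetical stand‐ins chosen so that the mixture can be checked by hand.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def integrate_out_hyperparameter(x, points, weights, conditional):
    """Approximate p(x | y) by a weighted sum of conditionals p(x | psi_m, y).

    `weights` play the role of p(psi_m | y) times the area weight of point m
    and are normalized here so that they sum to one.
    """
    total = sum(weights)
    return sum(conditional(x, psi) * w / total for psi, w in zip(points, weights))

def conditional(x, psi):
    """Hypothetical conditional posterior: zero-mean normal with variance 1 + psi."""
    return normal_pdf(x, mean=0.0, sd=math.sqrt(1.0 + psi))

points = [0.1, 0.3, 0.5]    # hypothetical integration points for one hyperparameter
weights = [0.2, 0.5, 0.3]   # hypothetical unnormalized posterior weights

density_at_zero = integrate_out_hyperparameter(0.0, points, weights, conditional)
```

The resulting marginal is a finite mixture of the conditional densities, which is why a handful of well‐placed points can be sufficient when the posterior of the hyperparameters is well behaved.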
We compare the results obtained using the INLA approach with MCMC. For the MCMC method, we use JAGS21 from within R with the help of the R2jags38 R package. Raw data, R code, and JAGS code to reproduce all results of this paper are presented in the Supporting Information. All analyses were run on a laptop with Intel(R) Core(TM) 4 Duo i3‐6100U processor 2.30 GHz.
4 APPLICATIONS
In this section, we illustrate the INLA technique using 3 different applications. In Section 4.1, an NMA dataset in diabetes is considered to evaluate the relative effect on HbA1c change of adding different oral glucose‐lowering agents. In Section 4.2, we analyze an NMA dataset comparing different interventions to aid smoking cessation. Lastly, in Section 4.3, a dataset comparing a number of treatments to prevent stroke in patients with atrial fibrillation is considered.
For the prior specifications, we use independent normal priors with mean zero and variance 1000 for all components of α in all 3 applications. For the hyperparameters τ and κ, uniform priors on the interval [0, 5] were used in the first and second applications as in Jackson et al.12 In the third application, we used uniform priors on the interval [0, 2] for hyperparameters τ and κ, which we take from Batson et al.39
For the implementations in JAGS, we adapted the BUGS code given in Dias et al,26 Jackson et al,12 and Dias et al13 in Sections 4.1, 4.2, and 4.3, respectively. Both for the Diabetes and Smoking applications, after a burn‐in of 30 000 iterations, 20 000 iterations in the fixed effect and consistency models and 50 000 iterations in the Jackson model were used to obtain posterior distributions. For the Stroke prevention application, after a burn‐in of 50 000 iterations, 50 000 additional iterations were used in the fixed effect and consistency models (with and without covariate), and 100 000 additional iterations were used for the Jackson models (with and without covariate). For all 3 applications, 3 MCMC chains were used with a thinning interval of 5. The number of iterations was chosen to ensure that all Monte Carlo standard errors were around 0.005. Convergence was checked using the JAGS implementation of the Gelman‐Rubin statistic,40 as well as visual inspection of trace plots and autocorrelation plots.
We used the DIC, which is available from R‐INLA, as a model comparison criterion.
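For readers unfamiliar with the Gelman‐Rubin statistic used in the convergence checks, the following sketch computes its basic form (the potential scale reduction factor) for several chains of equal length. It is a textbook version written for illustration; the implementation used with JAGS may differ in details such as additional correction factors, and the chains shown are hypothetical.

```python
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for chains of equal length."""
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean(variance(chain) for chain in chains)   # average within-chain variance
    between = n * variance(chain_means)                   # between-chain variance
    pooled = (n - 1) / n * within + between / n           # pooled variance estimate
    return (pooled / within) ** 0.5

# Hypothetical chains that explore the same region ...
well_mixed = [
    [0.1, -0.2, 0.3, 0.0, -0.1, 0.2],
    [0.0, 0.2, -0.3, 0.1, -0.2, 0.1],
    [-0.1, 0.3, 0.0, -0.2, 0.2, 0.1],
]
# ... and the same chains with the second one stuck in a different region.
shifted = [well_mixed[0], [v + 5.0 for v in well_mixed[1]], well_mixed[2]]

rhat_ok = gelman_rubin(well_mixed)
rhat_bad = gelman_rubin(shifted)
```

Values close to 1 indicate that the chains agree, whereas values well above 1, as for the shifted chains, signal that the chains have not converged to a common distribution.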
4.1 Diabetes: NMA with continuous endpoints

Figure 2 shows the posterior median and the 95% equi‐tailed credible interval (CI) obtained by INLA and by MCMC for all basic parameter estimates of the 3 fitted models. The estimates of heterogeneity and inconsistency standard deviations are displayed in Table 1. Individual inconsistency random effects are displayed in Table 2. Senn et al41 have fitted fixed effect and consistency models using frequentist methods to analyze the Diabetes data. Our results are in broad agreement with their results (see figures 5 and 7 in Senn et al41).

| | Consistency | | Jackson | |
|---|---|---|---|---|
| | MCMC | INLA | MCMC | INLA |
| Heterogeneity (τ) | | | | |
| Posterior median | 0.336 | 0.335 | 0.339 | 0.339 |
| Lower bound (95% CI) | 0.217 | 0.218 | 0.216 | 0.216 |
| Upper bound (95% CI) | 0.531 | 0.531 | 0.548 | 0.547 |
| Inconsistency (κ) | | | | |
| Posterior median | | | 0.122 | 0.122 |
| Lower bound (95% CI) | | | 0.006 | 0.007 |
| Upper bound (95% CI) | | | 0.480 | 0.488 |
- Abbreviations: INLA, integrated nested Laplace approximations; MCMC, Markov chain Monte Carlo.
| Design | Parameter | MCMC mean | MCMC stdev | INLA mean | INLA stdev |
|---|---|---|---|---|---|
| 1 | | −0.01 | 0.16 | −0.01 | 0.15 |
| 2 | | −0.01 | 0.18 | −0.01 | 0.17 |
| 2 | | −0.00 | 0.18 | −0.00 | 0.18 |
| 3 | | 0.04 | 0.16 | 0.04 | 0.16 |
| 4 | | −0.02 | 0.17 | −0.02 | 0.17 |
| 5 | | 0.03 | 0.18 | 0.03 | 0.17 |
| 6 | | −0.01 | 0.17 | −0.00 | 0.17 |
| 7 | | 0.00 | 0.17 | −0.00 | 0.16 |
| 8 | | 0.06 | 0.19 | 0.06 | 0.18 |
| 9 | | −0.00 | 0.18 | −0.00 | 0.17 |
| 10 | | 0.01 | 0.18 | 0.01 | 0.17 |
| 11 | | −0.00 | 0.20 | −0.00 | 0.19 |
| 12 | | −0.00 | 0.20 | −0.00 | 0.19 |
| 13 | | −0.06 | 0.19 | −0.05 | 0.18 |
| 14 | | 0.00 | 0.20 | −0.00 | 0.19 |
| 15 | | 0.02 | 0.17 | 0.02 | 0.17 |
| 16 | | 0.00 | 0.20 | −0.00 | 0.19 |
- Abbreviations: INLA, integrated nested Laplace approximations; MCMC, Markov chain Monte Carlo.
The median and 95% equi‐tailed CI of the heterogeneity standard deviation in Table 1 suggest substantial heterogeneity in the network. However, the estimates of the inconsistency standard deviation are close to zero, and the individual inconsistency parameters in Table 2 are almost zero with large standard deviations. Therefore, we can conclude that there is no evidence of substantial inconsistency in the network. The DIC values of the fixed effect model, the consistency model, and the Jackson model are 36.86, −28.82, and −28.27, respectively. The consistency model offers a large improvement in DIC compared to the fixed effect model, which supports the presence of heterogeneity. The DIC value of the Jackson model is very close to the DIC value of the consistency model. However, as displayed in Figure 2, including inconsistency random effects has a considerable impact on the credible intervals (hence, the precision) of the basic parameters. Therefore, although there is no strong evidence of inconsistency in this network, allowing for it in the model has a considerable impact on the results.
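The DIC comparison above amounts to simple arithmetic on the three reported values, which the following sketch makes explicit. The dictionary keys are labels chosen for illustration; the numbers are the DIC values reported in the text.

```python
# DIC values reported for the Diabetes application
dic = {"fixed effect": 36.86, "consistency": -28.82, "Jackson": -28.27}

# Smaller DIC is better; express every model relative to the best one.
best = min(dic, key=dic.get)
differences = {model: round(value - dic[best], 2) for model, value in dic.items()}
```

The fixed effect model is about 65.7 DIC units worse than the consistency model, whereas the Jackson model differs from it by only about 0.55 units.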
The MCMC and INLA methods gave very similar results. Across all 3 models, the largest absolute difference between the MCMC and INLA posterior median estimates of the basic parameters was found in the Jackson model for d18 (0.0059). Furthermore, the largest absolute difference between the INLA and MCMC posterior mean estimates of the individual inconsistency random effects in the Jackson model was 0.0032. For the Jackson model, the MCMC run took 30 seconds, while INLA took only 4.9 seconds.
4.2 Smoking cessation: NMA with dichotomous endpoints
The second application includes 24 trials investigating interventions to aid smoking cessation and has been considered by Jackson et al12 and Sauter and Held25 among others. The number of individuals who successfully quit smoking after 6 to 12 months is reported for 4 different interventions (1, no contact; 2, self‐help; 3, individual counseling; and 4, group counseling). The plot of the network is displayed in Figure 3. There are two 3‐arm trials, one for treatments 1, 3, and 4 and one for treatments 2, 3, and 4, and there are 8 different designs in the network.

Figure 4 shows the posterior median and the 95% equi‐tailed CI obtained by INLA and by MCMC for the basic parameter estimates of the consistency and the Jackson model. Furthermore, the marginal posterior densities from the Jackson model are displayed in Figure 5 as histograms of the MCMC samples and as solid lines for the INLA results. Finally, Table 3 shows the estimates of the inconsistency random effects obtained by MCMC and INLA. Jackson et al12 fitted the consistency and Jackson models to the Smoking dataset using MCMC. Our results are very similar to those displayed in table 3 of Jackson et al.12


| Design | Parameter | MCMC mean | MCMC stdev | INLA mean | INLA stdev |
|---|---|---|---|---|---|
| 1 | | 0.03 | 0.54 | 0.02 | 0.53 |
| 1 | | −0.26 | 0.63 | −0.28 | 0.64 |
| 2 | | −0.06 | 0.54 | −0.07 | 0.55 |
| 2 | | −0.09 | 0.54 | −0.10 | 0.55 |
| 3 | | −0.08 | 0.50 | −0.10 | 0.50 |
| 4 | | −0.12 | 0.56 | −0.13 | 0.55 |
| 5 | | 0.39 | 0.77 | 0.39 | 0.76 |
| 6 | | −0.10 | 0.54 | −0.11 | 0.55 |
| 7 | | 0.10 | 0.55 | 0.09 | 0.55 |
| 8 | | −0.04 | 0.51 | −0.03 | 0.50 |
- Abbreviations: INLA, integrated nested Laplace approximations; MCMC, Markov chain Monte Carlo.
The posterior median for the heterogeneity standard deviation is 0.82 with 95% CI [0.55, 1.3], suggesting that there may be notable heterogeneity in the network. The posterior median for the inconsistency standard deviation is 0.4 with 95% CI [0.02, 1.87], suggesting moderate but highly uncertain inconsistency. Moreover, when we examine the inconsistency random effects in Table 3, it is hard to claim that there is strong evidence for inconsistency in this network, since the standard deviations are very large. The DIC values of the consistency model and the Jackson model are 326.56 and 326.62, respectively. Since the DIC values are almost indistinguishable, we may conclude that there is no strong inconsistency in the network. On the other hand, as in the Diabetes application, adding inconsistency parameters to the consistency model influences the precision of the basic parameters, which can be seen from Figure 4. This observation was also made by Jackson et al.12
We can conclude that both inference techniques, MCMC and INLA, give similar results in our analysis. For the fitted Jackson model, the largest absolute difference between MCMC and INLA was 0.0035 for the posterior median estimates of the basic parameters and 0.017 for the posterior mean estimates of the inconsistency parameters. For the Jackson model, the MCMC run took 34.2 seconds, while the computing time was 6.5 seconds with INLA.
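The computing times reported for the Jackson model in the first 2 applications translate into speed‐up factors as follows. The sketch uses only the timings stated in the text; the dictionary labels are chosen for illustration.

```python
# (MCMC seconds, INLA seconds) for the Jackson model, as reported in the text
timings = {"Diabetes": (30.0, 4.9), "Smoking cessation": (34.2, 6.5)}

# Ratio of MCMC to INLA computing time, rounded to one decimal place
speedups = {name: round(mcmc / inla, 1) for name, (mcmc, inla) in timings.items()}
```

On the hardware described earlier, INLA was therefore roughly 5 to 6 times faster than MCMC for these 2 models.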
4.3 Stroke prevention: network meta‐regression with dichotomous endpoints
The data were collected and analyzed by Batson et al,39 and the raw data are given in the supporting information of their paper. The Stroke data include 19 studies that compare 15 different treatments to prevent stroke in patients with atrial fibrillation (AF). Treatments include fixed low‐dose warfarin with or without aspirin, aspirin monotherapy, aspirin plus clopidogrel, indobufen, idraparinux, triflusal, and ximelagatran. The corresponding network plot is given in Figure 6. The primary outcome was the number of patients who had stroke events, a dichotomous endpoint. The study‐level covariate mean age is available. We fit a network meta‐regression model as described in Section 2.4 using both MCMC and INLA. Placebo was chosen as the reference treatment. Note that since one study does not report the mean age, the dataset used for the network meta‐regression models is reduced by one study (hence, 18 studies). In the network meta‐regression models, the interaction coefficient (β) is common for all treatments versus placebo. Table 4 displays the results of the fitted consistency and Jackson models without a covariate and with the covariate (mean age) using MCMC and INLA. Moreover, individual inconsistency random effects of the Jackson model without the covariate are displayed in Table 5. Our results are in broad agreement with Figure 2 and Table 1 of Batson et al.39

| | No covariate, Consistency | | | No covariate, Jackson | | | Covariate (age), Consistency | | | Covariate (age), Jackson | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Median | 2.5% | 97.5% | Median | 2.5% | 97.5% | Median | 2.5% | 97.5% | Median | 2.5% | 97.5% |
| MCMC | | | | | | | | | | | | |
| β | | | | | | | 0.02 | −0.37 | 0.38 | 0.01 | −0.37 | 0.37 |
| d1,2 | −1.23 | −1.80 | −0.72 | −1.21 | −3.10 | 0.78 | −1.19 | −2.41 | −0.11 | −1.20 | −3.43 | 1.10 |
| d1,3 | −1.50 | −2.50 | −0.59 | −1.48 | −4.17 | 1.32 | −1.44 | −3.17 | 0.18 | −1.46 | −4.73 | 1.91 |
| d1,4 | −1.13 | −2.10 | −0.21 | −1.10 | −3.83 | 1.79 | −1.07 | −2.77 | 0.46 | −1.06 | −4.31 | 2.05 |
| d1,5 | −0.68 | −1.50 | 0.18 | −0.58 | −2.86 | 2.10 | −0.30 | −2.13 | 1.26 | −0.25 | −3.02 | 2.77 |
| d1,6 | −0.11 | −1.31 | 1.06 | 0.06 | −2.05 | 3.15 | 0.03 | −1.71 | 1.84 | 0.25 | −2.34 | 3.60 |
| d1,7 | −0.61 | −1.85 | 0.60 | −0.64 | −3.50 | 2.40 | −0.55 | −2.44 | 1.13 | −0.58 | −3.75 | 2.61 |
| d1,8 | −0.47 | −1.45 | 0.49 | −0.40 | −3.18 | 2.50 | −0.41 | −2.12 | 1.16 | −0.41 | −3.65 | 2.83 |
| INLA | ||||||||||||
| β | | | | | | | 0.01 | −0.35 | 0.36 | 0.01 | −0.36 | 0.37 |
| d1,2 | −1.21 | −1.79 | −0.70 | −1.22 | −2.84 | 0.41 | −1.19 | −2.37 | −0.17 | −1.20 | −3.34 | 0.90 |
| d1,3 | −1.48 | −2.50 | −0.55 | −1.49 | −3.86 | 0.89 | −1.46 | −3.13 | 0.04 | −1.47 | −4.52 | 1.55 |
| d1,4 | −1.10 | −2.12 | −0.17 | −1.11 | −3.48 | 1.26 | −1.08 | −2.75 | 0.41 | −1.09 | −4.14 | 1.92 |
| d1,5 | −0.67 | −1.52 | 0.24 | −0.59 | −2.53 | 1.59 | −0.31 | −1.97 | 1.30 | −0.28 | −2.96 | 2.49 |
| d1,6 | −0.10 | −1.29 | 1.14 | 0.03 | −1.99 | 2.50 | 0.06 | −1.54 | 1.81 | 0.21 | −2.35 | 3.29 |
| d1,7 | −0.63 | −1.85 | 0.58 | −0.62 | −3.07 | 1.85 | −0.60 | −2.38 | 1.07 | −0.60 | −3.71 | 2.49 |
| d1,8 | −0.42 | −1.47 | 0.55 | −0.43 | −2.81 | 1.96 | −0.40 | −2.08 | 1.12 | −0.41 | −3.47 | 2.62 |
| MCMC | ||||||||||||
| d1,9 | −1.60 | −2.74 | −0.49 | −1.55 | −4.12 | 1.38 | −1.55 | −3.35 | 0.15 | −1.55 | −4.95 | 1.92 |
| d1,10 | −1.32 | −2.16 | −0.53 | −1.28 | −4.04 | 1.51 | −1.27 | −2.85 | 0.08 | −1.29 | −4.61 | 1.74 |
| d1,11 | −1.25 | −2.21 | −0.34 | −1.22 | −4.11 | 1.92 | −1.20 | −2.87 | 0.38 | −1.21 | −4.57 | 1.96 |
| d1,12 | −1.28 | −2.27 | −0.40 | −1.26 | −3.90 | 1.77 | −1.22 | −2.93 | 0.39 | −1.24 | −4.55 | 1.95 |
| d1,13 | −0.88 | −1.86 | 0.05 | −0.85 | −3.44 | 2.04 | −0.82 | −2.55 | 0.77 | −0.83 | −4.03 | 2.40 |
| d1,14 | −1.24 | −2.21 | −0.36 | −1.21 | −3.84 | 1.69 | −1.18 | −2.92 | 0.38 | −1.20 | −4.37 | 1.95 |
| d1,15 | 0.15 | −0.78 | 1.06 | 0.18 | −2.10 | 2.78 | 0.26 | −1.40 | 1.73 | 0.31 | −2.40 | 3.27 |
| τ | 0.23 | 0.01 | 0.84 | 0.27 | 0.01 | 0.96 | 0.37 | 0.02 | 1.30 | 0.38 | 0.02 | 1.31 |
| κ | | | | 0.58 | 0.02 | 1.86 | | | | 0.61 | 0.02 | 1.89 |
| INLA | ||||||||||||
| d1,9 | −1.57 | −2.76 | −0.44 | −1.57 | −4.01 | 0.86 | −1.55 | −3.32 | 0.07 | −1.56 | −4.66 | 1.51 |
| d1,10 | −1.28 | −2.15 | −0.49 | −1.29 | −3.61 | 1.04 | −1.26 | −2.73 | 0.04 | −1.27 | −4.22 | 1.65 |
| d1,11 | −1.23 | −2.24 | −0.30 | −1.23 | −3.60 | 1.14 | −1.20 | −2.87 | 0.29 | −1.21 | −4.26 | 1.80 |
| d1,12 | −1.26 | −2.28 | −0.33 | −1.26 | −3.64 | 1.11 | −1.24 | −2.90 | 0.26 | −1.25 | −4.30 | 1.77 |
| d1,13 | −0.85 | −1.86 | 0.07 | −0.86 | −3.22 | 1.51 | −0.83 | −2.49 | 0.66 | −0.84 | −3.88 | 2.17 |
| d1,14 | −1.21 | −2.22 | −0.29 | −1.21 | −3.58 | 1.15 | −1.19 | −2.85 | 0.30 | −1.20 | −4.24 | 1.82 |
| d1,15 | 0.16 | −0.79 | 1.10 | 0.19 | −1.82 | 2.32 | 0.26 | −1.23 | 1.69 | 0.29 | −2.30 | 2.98 |
| τ | 0.26 | 0.02 | 0.87 | 0.27 | 0.02 | 0.92 | 0.35 | 0.03 | 1.21 | 0.38 | 0.03 | 1.29 |
| κ | | | | 0.55 | 0.03 | 1.61 | | | | 0.65 | 0.03 | 1.85 |
- Note: The first line shows the estimate for the interaction coefficient (β). INLA, integrated nested Laplace approximations; MCMC, Markov chain Monte Carlo.
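For readers who prefer the odds ratio scale, the estimates in Table 4 can be back‐transformed. The following is a minimal sketch that assumes the basic parameters d1,k are log odds ratios versus placebo (that is, a logit link for the dichotomous end point); the helper name `to_odds_ratio` is hypothetical, and the input is the INLA estimate of d1,2 from the consistency model without covariate.

```python
import math

def to_odds_ratio(median, lower, upper):
    """Back-transform a log odds ratio summary (median, 2.5%, 97.5%) to odds ratios."""
    # exp() is monotone increasing, so posterior quantiles of the log odds ratio
    # map directly onto the same quantiles of the odds ratio.
    return tuple(round(math.exp(x), 2) for x in (median, lower, upper))

# INLA, consistency model, no covariate: d1,2 = -1.21 with 95% CI [-1.79, -0.70]
d12_or = to_odds_ratio(-1.21, -1.79, -0.70)
print(d12_or)
```

Under this assumption, an estimate of −1.21 corresponds to an odds ratio of about 0.30 for treatment 2 versus placebo.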
| Design | Parameter | MCMC: Mean | MCMC: Stdev | INLA: Mean | INLA: Stdev |
|---|---|---|---|---|---|
| 1 | | −0.05 | 0.86 | −0.01 | 0.70 |
| 2 | | 0.02 | 0.87 | −0.00 | 0.70 |
| 2 | | 0.00 | 0.87 | −0.00 | 0.70 |
| 3 | | −0.06 | 0.67 | −0.03 | 0.57 |
| 4 | | −0.38 | 0.82 | −0.00 | 0.70 |
| 4 | | −0.14 | 0.69 | −0.11 | 0.58 |
| 4 | | −0.17 | 0.69 | −0.15 | 0.56 |
| 5 | | 0.01 | 0.85 | −0.00 | 0.70 |
| 6 | | 0.42 | 0.83 | 0.34 | 0.68 |
| 7 | | −0.01 | 0.86 | −0.00 | 0.70 |
| 8 | | −0.04 | 0.87 | −0.00 | 0.70 |
| 9 | | 0.03 | 0.67 | 0.01 | 0.55 |
| 10 | | −0.01 | 0.84 | −0.00 | 0.70 |
| 11 | | 0.00 | 0.90 | −0.00 | 0.70 |
| 12 | | −0.01 | 0.86 | −0.00 | 0.70 |
| 13 | | −0.01 | 0.89 | −0.00 | 0.70 |
| 13 | | −0.01 | 0.87 | −0.00 | 0.70 |
- Abbreviations: INLA, integrated nested Laplace approximations; MCMC, Markov chain Monte Carlo.
From the Jackson model using INLA, the posterior median of the heterogeneity standard deviation is 0.27 with 95% CI [0.02, 0.92], suggesting moderate heterogeneity in the network. The posterior median of the inconsistency standard deviation is 0.55 with 95% CI [0.03, 1.61], suggesting that there may be notable inconsistency in the network, although with high uncertainty. The DIC values of the consistency model and the Jackson model without covariates are 283.05 and 283.72, respectively, which shows that adding inconsistency parameters does not result in any improvement in DIC. Among the individual inconsistency random effects, only two are relatively large, and both have very large standard deviations. On the other hand, adding the covariate to the consistency model and the Jackson model actually increases the estimates of both the heterogeneity and the inconsistency standard deviations (τ and κ). The DIC values of the consistency model and the Jackson model with covariates are 271.31 and 272.05, respectively. To compare the models with and without covariate information on the same data, we calculate the DIC values of the consistency model and the Jackson model without covariate after dropping the study that does not report mean age; the results are 269.78 and 270.58, respectively. This suggests that adding the covariate does not offer any notable improvement in the DIC values. Furthermore, from the results of the Jackson model with covariate, the posterior median estimate of β is 0.01 with 95% CI [−0.36, 0.37]. Therefore, we conclude that including the mean‐age covariate in the model fails to explain the heterogeneity and/or the inconsistency in the network.
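The DIC comparisons in this paragraph reduce to a few differences, which the short sketch below tabulates. It uses only the DIC values quoted in the text; the dictionary layout and variable names are illustrative choices, not part of nmaINLA.

```python
# DIC values quoted in the text; lower is better, and values are only
# comparable between models fitted to the same set of studies.
dic = {
    "19 studies, no covariate": {"consistency": 283.05, "jackson": 283.72},
    "18 studies, no covariate": {"consistency": 269.78, "jackson": 270.58},
    "18 studies, covariate": {"consistency": 271.31, "jackson": 272.05},
}

# Change in DIC from adding inconsistency random effects (Jackson minus consistency)
jackson_minus_consistency = {
    label: round(values["jackson"] - values["consistency"], 2)
    for label, values in dic.items()
}

# Change in DIC from adding mean age, using the same 18 studies in both fits
covariate_minus_none = {
    model: round(
        dic["18 studies, covariate"][model] - dic["18 studies, no covariate"][model], 2
    )
    for model in ("consistency", "jackson")
}

print(jackson_minus_consistency)
print(covariate_minus_none)
```

All differences are positive, so neither the inconsistency parameters nor the covariate lowers the DIC on a like‐for‐like comparison.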
MCMC and INLA gave similar results. For the Jackson model with covariate, the MCMC run took approximately 27.1 seconds, whereas INLA took only 5.4 seconds.
5 DISCUSSION
We have presented an approximate Bayesian inference technique, INLA, to fit various contrast‐based NMA models with arm‐based likelihood, including the Jackson model, as well as their network meta‐regression extensions. Furthermore, to make the approach more accessible to researchers, we provide an R package, nmaINLA, which automates the INLA implementation of NMA models. There are good reasons to prefer INLA to MCMC. Firstly, INLA is faster. Secondly, there is no need to check MCMC convergence diagnostics. This is crucial for large networks, since the number of parameters whose diagnostics must be checked increases dramatically with network size.
There is an ongoing debate about the merits of contrast‐based (CB) and arm‐based (AB) models. Relative treatment effects are assumed to be exchangeable across trials in the CB approach, whereas the AB approach assumes that absolute treatment effects are exchangeable.8 Supporters of the CB approach claim that “arm‐based pooling effectively breaks randomization, and in fact runs against the entire way in which randomized controlled trials are designed, analysed, and used.”8 AB modelers respond that “although AB models require different assumptions than CB models, it is not obvious that they are less reasonable, and the payoffs they can provide (significantly increased modeling flexibility, as well as greater ease of interpretation, prior specification, and model fitting) can be substantial.”9 For our purposes, the relevant point is that AB models also belong to the family of LGMs. Therefore, it is certainly possible to use INLA to fit AB models, although our package nmaINLA does not yet support them. As an alternative to the Jackson model, the node‐splitting method10 can be used to detect network inconsistencies. Although we have not discussed this method or implemented it in nmaINLA, it can also be fitted with INLA. The explanations and the necessary R code are presented in Sauter and Held.25
One may find it restrictive to assume that heterogeneity and inconsistency random effects are normally distributed and hence explore other distributions for the random effects, for instance, the t distribution.42 Although this modeling approach falls outside the scope of latent Gaussian models, INLA can still be used as an inference tool for such models.43
Unfortunately, there is no analytical expression for the approximation error of INLA. A simple way to investigate its accuracy is to compare its results with those of long MCMC runs. The accuracy of INLA for fitting GLMMs has been investigated in extensive simulation studies by Fong et al31 and Grilli et al.44 They reported that INLA works very well in most cases, but in some extreme cases of binary GLMMs with few or zero events per variable, INLA exhibits some inaccuracy. Moreover, for the special situation in which a covariate (almost) perfectly predicts the response (quasicomplete separation) in binary response GLMMs, Sauter and Held45 have shown that the approximation error of INLA is substantial. Ferkingstad and Rue46 introduced a copula‐based correction, which significantly increases the accuracy of INLA in such extreme cases of GLMMs and is already implemented in R‐INLA. In fact, in such “sparse data” situations47 with binary GLMMs, it is known that both maximum likelihood methods and Bayesian inference with vague priors may result in serious bias away from the null.48 Hence, different penalization techniques for maximum likelihood estimation (MLE) or weakly informative priors for Bayesian inference have been advocated to avoid such biases.49 Such problems may occur in the NMA context as well, especially when the model is a binary GLMM. Therefore, network meta‐analysts should be careful to avoid biased results regardless of the inference tool used (MLE, MCMC, or INLA).
Using vague priors for the hyperparameters of NMA models may make it extremely hard to identify these parameters. This can be overcome by using more informative priors. Law et al17 proposed using predictive distributions as priors for hyperparameters when fitting the Jackson model. On the other hand, Simpson et al50 introduced a principled and broad framework for constructing prior distributions for a large class of hierarchical models. The priors that they develop, penalized complexity (PC) priors, are implemented in R‐INLA; hence, they can be used in an NMA context, especially for constructing priors for hyperparameters. Moreover, checking the sensitivity of the heterogeneity and inconsistency parameters to the chosen prior distributions may be particularly useful for NMA models. Although we did not discuss any sensitivity analysis, it can easily be conducted owing to the computational speed of INLA.51 We note that the standard ranking of treatments as discussed in Lu and Ades2 is not possible using INLA. Although point estimates of the treatment ranks can be provided, it is not possible to estimate the associated uncertainty using INLA, because INLA computes marginal posteriors but not joint posteriors. On the other hand, the standard ranking may be misleading since it does not take other evidence into account.52
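To make the PC priors mentioned above more concrete: for the standard deviation of a Gaussian random effect (such as τ or κ), the PC prior is an exponential distribution whose rate is fixed by a tail statement P(σ > U) = α. The sketch below computes that rate and checks the tail probability. The function names and the values U = 1 and α = 0.05 are hypothetical choices for illustration, not recommendations from this paper.

```python
import math

def pc_prior_rate(upper, alpha):
    """Rate of the exponential PC prior on a standard deviation, from P(sd > upper) = alpha."""
    if upper <= 0 or not 0 < alpha < 1:
        raise ValueError("upper must be positive and alpha must lie in (0, 1)")
    return -math.log(alpha) / upper

def pc_prior_density(sd, rate):
    """Exponential density on the standard deviation scale."""
    return rate * math.exp(-rate * sd) if sd >= 0 else 0.0

def pc_prior_tail(sd, rate):
    """Prior probability that the standard deviation exceeds sd."""
    return math.exp(-rate * sd)

# Hypothetical statement: a priori, P(sd > 1) = 0.05
rate = pc_prior_rate(upper=1.0, alpha=0.05)
print(rate, pc_prior_tail(1.0, rate), pc_prior_density(0.5, rate))
```

In R‐INLA, the corresponding prior is specified on the precision under the name `pc.prec`, with the pair (U, α) as its parameters. Rerunning a model under several such pairs is one simple way to carry out the sensitivity analysis described above.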
ACKNOWLEDGMENTS
We thank Rafael Sauter, who wrote the first version of the R package nmaINLA; Sarah Batson, who kindly answered our questions regarding the stroke application discussed in Section 4.3; and Christian Röver for carefully proofreading this manuscript. We would also like to thank 2 anonymous reviewers, whose recommendations led to substantial improvements in the content and presentation of this paper. This research has received funding from the EU's 7th Framework Programme for research, technological development and demonstration under grant agreement number FP HEALTH 2013‐602144 with project title (acronym) “Innovative Methodology for Small Populations Research” (InSPiRe).
REFERENCES