Multilevel network meta‐regression for population‐adjusted treatment comparisons

Summary Standard network meta‐analysis (NMA) and indirect comparisons combine aggregate data from multiple studies on treatments of interest, assuming that any effect modifiers are balanced across populations. Population adjustment methods relax this assumption using individual patient data from one or more studies. However, current matching‐adjusted indirect comparison and simulated treatment comparison methods are limited to pairwise indirect comparisons and cannot predict into a specified target population. Existing meta‐regression approaches incur aggregation bias. We propose a new method extending the standard NMA framework. An individual level regression model is defined, and aggregate data are fitted by integrating over the covariate distribution to form the likelihood. Motivated by the complexity of the closed form integration, we propose a general numerical approach using quasi‐Monte‐Carlo integration. Covariate correlation structures are accounted for by using copulas. Crucially for decision making, comparisons may be provided in any target population with a given covariate distribution. We illustrate the method with a network of plaque psoriasis treatments. Estimated population‐average treatment effects are similar across study populations, as differences in the distributions of effect modifiers are small. A better fit is achieved than a random effects NMA, uncertainty is substantially reduced by explaining within‐ and between‐study variation, and estimates are more interpretable.

The multivariate Normal scenario offers some insight into the relationship between ML-NMR and the naïve approach of "plugging in" mean covariate values. Equation (A.2) can be viewed as an adjustment to the naïve model, with the adjustment term involving the covariate covariance matrix Σ jk accounting for the integration over the population. Examining the adjustment term, we notice that this depends upon • The covariance of prognostic variables and effect modifiers, and the strength of each, through β T 1 Σ jk β 2,k . This is large when highly prognostic variables are correlated with strong effect modifiers, or when strong effect modifiers which are also highly prognostic take a wide range of values in the j-th study population (the population variance is large).
• The covariance of effect modifiers and their strength, through β T 2,k Σ j β 2,k . This is large when strong effect modifiers are highly correlated, or when strong effect modifiers take on a wide range of values in the j-th study population (the population variance is large).
This lends theoretical justification to intuitive thinking for when effect modification may lead to aggregation bias. In particular, aggregation bias is expected to be small if either effect modifiers are weak, or the within-study variance of effect modifiers is small. The covariance inequality bounds the covariance between two variables by their standard deviations, |cov(X 1 , X 2 )| ≤ var(X 1 ) var(X 2 ); small within-study variance in the effect modifiers is therefore sufficient to also limit the aggregation bias due to correlations.
To use the two-parameter Binomial approximation to the Poisson Binomial likelihood (see equation 4), we additionally need to evaluatep 2 jk . The derivation is similar to that forp jk , except now the integration is over the squared linear predictor: For probit regression with g −1 (·) = Φ(·), and multivariate Normal covariates, the integration is carried out as follows: where T(·, ·) is Owen's T function, which may be evaluated using fast numerical methods (e.g. quadrature). For skew covariates, evaluation of the aggregation integral (1d) is troublesome: analytic integration for both log-Normal and Gamma distributed covariates is not straightforward. In an ecological context, Salway and Wakefield (2005) noted that estimates remained biased when skew covariates were treated as Normal (although less so than the naïve approach), and uncertainty was underestimated.

A.2. Analytic integration with the logit link function
When using the logit link function g(p) = log p 1−p , as in logistic regression, the integral (1d) does not have a closed form solution. Previous authors have suggested approximating the logit link by a probit (Salway and Wakefield, 2005;Demidenko, 2004), for example One-probit approx.: The two-probit approximation is particularly robust, and still results in a proper likelihood distribution as the coefficients sum to one. Higher order approximations are possible and can be even more accurate, but do not necessarily result in proper likelihood distributions. Continuing with the one-probit approximation first, the average success probability on treatment k in the j-th study is then approximatelȳ If the covariates are multivariate-Normal, x j ∼ MVN(m j , Σ j ), then we have (Salway and Wakefield, 2005)p Derivation proceeds similarly for the two-probit approximation with multivariate-Normal covariates x j ,

. Imputing a correlation structure with copulae
We typically only have access to published marginal covariate information for the AgD trials, and therefore have no information on the correlation structure between the covariates. Rather than assuming that all correlations between covariates in the AgD are zero (which may be unreasonable), we can instead assume that the correlation structures are similar in AgD and IPD.
To account for this, we compute a correlation matrix Ω for the AgD trials from the IPD trials (or a representative subset), and impose this upon the Sobol' integration pointsũ using a Gaussian copula (Nelsen, 2006). This is equivalent to applying the inverse CDF Φ −1 Ω (·) of the multivariate Normal with correlation matrix Ω, and then the standard multivariate Normal CDF Φ(·), , to obtain correlated Sobol' pointsũ * . (In practice this is computed component-wise conditionally; we use the implementation in the R package copula (Yan, 2007).) The correlated Sobol' points are then transformed using the inverse CDF method to match the marginal covariate distributions reported in the AgD trials, where F −1 jk;l (·) is the inverse CDF of the marginal distribution assumed for the l-th covariatex jk;l on treatment k in study j. The resulting integration pointsx jk capture the correlations between the covariates (e.g. longer duration of psoriasis is correlated with having previous systemic treatment) whilst preserving the marginal distribution for each covariate.

A.4. Assessing inconsistency
ML-NMR, like standard IPD and AgD NMA, makes an assumption of consistency that is enforced through a set of consistency equations d ab = d b − d a . For ML-NMR, consistency applies to both the individual-level treatment effects, γ ab = γ b − γ a , and the effect modifier interactions, β 2,ab = β 2,b − β 2,a . The causes of inconsistency in ML-NMR are the same as the causes of heterogeneity described in section 2.4. For example, there may be effect modifiers that have not been included in the model or other model misspecification, the assumed joint covariate distributions used to adjust the results from aggregate studies may be incorrect, or the shared effect modifier assumption (if it was used) may be invalid. Attempts may be made to rectify these issues in a revised model-if data permits-and the revised model may then be assessed for inconsistency. To assess inconsistency, we can use the same approaches as standard IPD and AgD NMA-such as the unrelated mean effects model (Dias et al., 2011b), and node-splitting models (Dias et al., 2010).

Unrelated mean effects
The unrelated mean effects (UME) model (Dias et al., 2011b) treats all contrasts-both basic (i.e. against treatment 1) and functional-as independent parameters, without imposing consistency. The UME model needs to be written with the study-specific baselines referring to a reference arm in each trial (the "baseline shift" parameterisation), rather than the reference treatment 1, since the latter imposes consistency implicitly. For random effects ML-NMR (9), we must also consider the EM interaction terms, and whether or not we allow these to be inconsistent too (Donegan et al., 2017). The linear predictor and random effects structure of the UME model for RE ML-NMR are written as for a study j with treatment t 1 in arm 1, where µ (t 1 ) j is the study-specific baseline with respect to t 1 .
If we apply the consistency equations to the EM interactions, β 2,ab = β 2,b − β 2,a , then (A.9) only relaxes consistency in the treatment effects. There can also be inconsistency in the EM interactions (Donegan et al., 2017); to assess this as well, we instead place independent prior distributions on β 2,ab , such as N(0, σ 2 β 2 ) for a suitable prior variance σ 2 β 2 . However, this requires sufficient data on each contrast to estimate independent interactions. This may not be possible, for example if there are contrasts which are only informed by a small number of AgD studies.
An intermediate approach is possible when the shared EM assumption (section 2.3) is used to fit the ML-NMR model, so that the regression coefficients β 2,k for a set of treatments k ∈ T are all equal. In this case, we can use the shared EM assumption-which implies that certain interactions are zero or equal to each other-and allow the remaining interactions to be inconsistent. To achieve this, consider (without loss of generality) that every treatment is assigned to a mutually exclusive set T 1 , T 2 , . . . (some treatments may be in a set by themselves). Then, using the shared EM assumption, EM interactions for contrasts between any two treatments within a given set T are equal to zero, β 2,ab = 0 for any two treatments a, b ∈ T . EM interactions for contrasts between treatments in any two different sets T 1 , T 2 are equal, β 2,a 1 a 2 = β 2,b 1 b 2 for any treatments a 1 , b 1 ∈ T 1 and a 2 , b 2 ∈ T 2 , and are assigned a prior distribution such as N(0, σ 2 β 2 ). This allows us to assess inconsistency of the shared EM interactions, and such a model should always be identifiable when the corresponding standard (consistency) ML-NMR model with shared EM interactions is identifiable.
In any case, evidence for inconsistency is then based on comparing the model fit (e.g. using residual deviance and DIC) between the ML-NMR model assuming consistency and the UME model without consistency.

Node-splitting
The node-splitting approach for network meta-regression models (Donegan et al., 2017) is naturally applicable to ML-NMR in the same manner as IPD network meta-regression. For a given contrast b vs. a , the node-splitting model splits the estimation of the relative effect γ a b and effect modifier interactions β 2,a b into parameters estimated by direct evidence only, γ D a b and β D 2,a b , and parameters estimated by the indirect evidence from the rest of the network, γ I a b and β I 2,a b . To achieve this, the random effects ML-NMR model (9) remains the same for studies not including both a and b treatment arms. For studies including both a and b treatment arms, we choose to re-parameterise the model with a as the reference treatment within these studies. Mathematically, we write out the linear predictor and random effects for this node-splitting model as For studies without both a and b treatment arms: For studies with both a and b treatment arms: where the re-parameterised study-specific baselines with respect to treatment a are denoted by µ (a ) j , and we write the study-specific baselines with respect to treatment 1 as µ (1) j = µ j for additional clarity. As usual we set γ 1 = δ j1 = 0 and β 2,1 = 0. If there are multi-arm studies with both a and b treatment arms, then the other random effects δ j a k with k b are uncorrelated with δ j a b , but are still correlated between themselves with cor(δ j a a , δ j a b ) = 0.5 for a, b a , b (assuming homogeneous τ 2 ). The indirect estimates γ I a b and β I 2,a b are obtained from the consistency equations (A.11) The node-splitting model as written in (A.10) splits the EM interaction terms for all covariates at once. Alternatively, a separate node-splitting model could be fitted for each covariate in turn, where β D 2,a b is broken down into a split interaction for one covariate, β D 2,a b ;l , and the consistency equations are applied for the remaining covariates β 2,a b ;l = β 2,b ;l − β 2,a ;l . The latter approach may be more tractable in scenarios with smaller amounts of data on the b vs. a contrast and/or large numbers of effect modifying covariates, since there are only 2 more parameters than the standard RE ML-NMR model (γ D a b and β D 2,a b ;l ), as opposed to L + 1 more when splitting all EM interactions at once (γ D a b and β D 2,a b ), where L is the number of covariates. Furthermore, there may be insufficient data on the b vs. a contrast even to node-split the EM interaction terms one covariate at a time, for example if the direct evidence consists of only a small number of AgD studies. In this case, we may be able to assess inconsistency in the treatment contrast γ a b by node-splitting into γ D a b and γ I a b , but not in the EM interactions, keeping the consistency equations β 2,a b = β 2,b − β 2,a .
As with the unrelated mean effects model (appendix A.4), evidence for inconsistency based on comparing the model fit (e.g. using residual deviance and DIC) between the ML-NMR model with and without node-splitting, and also checking whether the heterogeneity variance τ 2 is reduced. The graphical summaries described by Donegan et al. (2017) may also be utilised.   Note that the interval for MAIC is a 95% Confidence Interval, as MAIC is a frequentist method.