Meta-analysis of few studies involving rare events

Meta-analyses of clinical trials targeting rare events face particular challenges when the data lack adequate numbers of events for all treatment arms. Especially when the number of studies is low, standard meta-analysis methods can lead to serious distortions because of such data sparsity. To overcome this, we suggest the use of weakly informative priors (WIP) for the treatment effect parameter of a Bayesian meta-analysis model, which may also be seen as a form of penalization. As a data model, we use a binomial-normal hierarchical model (BNHM) which does not require continuity corrections in case of zero counts in one or both arms. We suggest a normal prior for the log odds ratio with mean 0 and standard deviation 2.82, which is motivated (1) as a symmetric prior centred around unity and constraining the odds ratio to within a range from 1/250 to 250 with 95 % probability, and (2) as consistent with empirically observed effect estimates from a set of $\mbox{$37\,773$}$ meta-analyses from the Cochrane Database of Systematic Reviews. In a simulation study with rare events and few studies, our BNHM with a WIP outperformed a Bayesian method without a WIP and a maximum likelihood estimator in terms of smaller bias and shorter interval estimates with similar coverage. Furthermore, the methods are illustrated by a systematic review in immunosuppression of rare safety events following paediatric transplantation. A publicly available $\textbf{R}$ package, $\texttt{MetaStan}$, is developed to automate the $\textbf{Stan}$ implementation of meta-analysis models using WIPs.


Introduction
Individual clinical studies are often underpowered to detect differences in the rates of rare events, for example safety events, and thus meta-analysis may be the only way to obtain reliable evidence of treatment differences with regard to the rare events. 1 On the other hand, meta-analysis of clinical studies for rare events faces particular challenges, since the numbers of events might be very small in some treatment arms. The problem is even more pronounced when some studies have no events either in one or both treatment arms (so-called single-zero or double-zero studies).
Standard (approximate) meta-analysis methods, for example the normal-normal hierarchical model, 2 require a continuity correction in case of single-zero or double-zero studies, that is, the addition of a fixed value (typically 0.5) to all cells of the contingency table for studies with no events. Such simple approaches have been found problematic for meta-analyses involving rare events. 3 Therefore, statistical models based on exact distributional assumptions have been suggested. These include different parametrizations of the binomial-normal hierarchical model, 4 a mixed effects conditional logistic model, 5 a Poisson-normal hierarchical model, 6 a Poisson-Gamma hierarchical model, 7 or a betabinomial model. 8 In this paper, we focus on a parametrization of the binomial-normal hierarchical model (BNHM) which was suggested by Smith et al. 9 Consider an extreme case of meta-analysis of rare events, where all studies include no events for the same treatment arm. This data sparsity problem in a meta-analysis can be seen as a separation problem in the logistic regression context 10 in which case maximum likelihood estimate (MLE) for treatment effect parameter does not exist. A very useful way to deal with separation problems, or, more generally, data sparsity in logistic regression is penalization, that is, adding a penalty (adjustment) term to the original likelihood function to regularize (or stabilize) the estimates. 11 In a frequentist framework, penalty terms may be specified so that these "nudge" the MLE into a desired direction if the maximum is not or poorly defined; one such example is Firth penalization. 12,13,11 From a Bayesian viewpoint, penalization may often be motivated as weakly informative priors (WIPs) that are multiplied to the likelihood function. 14 Numbers of studies included in meta-analyses are typically small, posing additional challenges. 15 For Bayesian meta-analysis of few studies, different WIPs have been suggested for the heterogeneity parameter, see 16 for penalized MLE approach, and also see 17,18,19,20,21 for fully-Bayesian inference.
Here we consider the meta-analysis of few studies targeting rare events. To deal with data sparsity present in the meta-analysis of few studies with rare events, we suggest the use of WIPs for the treatment effect parameter in a fully-Bayesian context inspired by penalization ideas. 11, 14 We use a BNHM which is parameterized in terms of baseline risks and a treatment effect for the data. Note that this is a contrast-based model meaning that relative treatment effects are assumed to be exchangeable across trials. 22 Our suggested default WIP for the treatment effect parameter is motivated via the consideration of the prior expected range of treatment effect values. Furthermore, it is consistent with effect estimates empirically observed in a large set of meta-analyses from the Cochrane Database of Systematic Reviews (CDSR) with binary endpoints.
The main contribution of this paper is the introduction of default WIPs as penalization for treatment effect parameters to deal with data sparsity in the meta-analysis of few studies involving rare events. Another contribution is the introduction of an R package, MetaStan (https://github.com/gunhanb/MetaS which is developed to automate a Bayesian implementation of meta-analysis models using WIPs as described in the paper, and which is publicly available from Github. In Section 2, we describe a systematic review concerning rare safety events associated with immunosuppressive therapy following paediatric transplantation. In Section 3, we describe the application of weakly informative priors (WIPs) for the treatment effect parameter. We review a BNHM for meta-analysis, discuss the derivation of WIP, and an empirical investigation of treatment effect parameter estimates from the CDSR. Long-run properties of different methods including the proposed one are investigated in the simulation studies in Section 5. In Section 6, the example is revisited to illustrate the proposed method and its implementation. We close with some conclusions and provide a discussion.

An application in paediatric transplantation
Several rare paediatric liver diseases can nowadays be successfully treated by liver transplantation with good long-term outcomes. Crins et al 23 conducted a systematic review of controlled but not necessarily randomized studies of the Interleukin-2 receptor antibodies (IL-2RA) basiliximab and daclizumab in paediatric liver transplantation. Primary outcomes were acute rejections (ARs), steroid-resistant rejections (SRRs), graft-loss and death. Their analyses were based on a random effects meta-analysis using a restricted maximum likelihood approach (REML). 24 Crins et al 23 used risk ratios as effect measures, while we use the odds ratios, here. With rare events, however, these should be very similar. Heterogeneity was assessed using Cochrane's Q test. 25 Secondary outcomes included renal dysfunction and post-transplant lymphoproliferative disease (PTLD). For illustrative purposes, here we focus on death and PTLDs, and these outcomes are displayed in Table 1.
The specific problems with meta-analyses concerning rare events outlined in the introduction are prominent here. Firstly, the numbers of events are very small. For the PTLD dataset, there is one

Weakly informative priors for the treatment effect
In this section, we present the usage of weakly informative priors (WIPs) for the treatment effect parameter to conduct random effects meta-analysis of rare events with few studies. As a data model, we review a binomial-normal hierarchical model, and then show how to derive a WIP for a treatment effect parameter. Then, empirical evidence obtained from the Cochrane Database of Systematic Reviews supporting the choice of WIPs is illustrated.

Data model
The binomial-normal hierarchical model (BNHM) has been introduced by Smith et al. 9 In the BNHM, for each trial i ∈ {1, . . . , k} and treatment arm j ∈ {0, 1}, the event counts r ij are modeled using a binomial distribution, that is, r ij ∼ Bin(π ij , n ij ). The logit link is used to transform π ij onto the log odds ratio scale where effects can be assumed to be additive where x ij is a treatment indicator, namely +0.5 = experimental (j = 1) and −0.5 = control (j = 0). The µ i are fixed effects denoting the baseline risks of the event in each study i, θ is the mean treatment effect and τ is the heterogeneity in treatment effects between trials. The BNHM belongs to the family of generalized linear mixed models (GLMMs); this family also includes models for other types of data including continuous or count outcomes. It is important to note here that by treating the baseline risks µ i as fixed effects, the analysis effectively stratifies the risk by study, as pooling of risks might compromise the studies' randomization. In this sense it constitutes a contrast-based model. 22 Unlike the normal-normal hierarchical model, the BNHM does not rely on a normal approximation, since it builds on the binomial nature of the data directly.
The BNHM can be fitted using frequentist approaches, for example via maximum likelihood estimation (MLE). 4 Alternatively, Bayesian methods are commonly used. In a fully Bayesian approach, prior distributions for parameters θ, µ i and τ need to be specified. Note that the parameter θ is on the log-odds ratio scale whereas µ i are on the log-odds scale. Baseline risks (µ i ) may be seen as intercept parameters in a standard logistic regression model. For µ i , we use a vague normal prior with mean 0 and standard deviation 10, following the recommendation by Gelman et al. 14 The prior choice for θ is our main focus and will be discussed in Section 3.2. The prior choice for the heterogeneity parameter τ , which is a standard deviation parameter, has gained much attention in the literature as discussed in the introduction. Friede et al 18 have shown that for meta-analysis of few studies, the use of WIPs for τ displays desirable long-run properties in comparison to frequentist alternatives. Following their suggestions, we use a half-normal prior with scale of 0.5 (HN (0.5)) for τ which has the median of 0.337 with an upper 95% quantile of 0.98. Values for τ of 0.25, 0.5, 1, and 2 represent moderate, substantial, large, and very large heterogeneity. 29 Thus a HN (0.5) prior captures heterogeneity values typically seen in meta-analyses of heterogeneous studies and will therefore be a sensible choice in many applications.

Derivation of a weakly informative prior for the treatment effect
A common prior choice for the treatment effect parameter θ is a non-informative (vague) prior such as normal distribution with a large variance, for example N (0, 100 2 ). One way of constructing a WIP works via consideration of the prior expected range of treatment effect values. 30 Before the derivation of the WIP for treatment effect parameter θ, recall that θ is on the log odds ratio scale. Thus, a value of θ = 0 means an odds ratio of 1, i.e., "no effect", and a value of θ = 1 means that odds differ by a factor (ratio) of exp(1) = 2.7.
We assume a symmetric prior centred around zero, implying equal probabilities for positive or negative treatment effects. Symmetry then implies (on the log odds ratio scale) that where (on the odds ratio scale) The prior's scale parameter σ prior then may be set such that a priori the odds ratio is with 95% probability confined to a certain range: In case of a normal prior with standard deviation σ prior , we can then simply specify We conservatively specify δ as 250, meaning that we consider it unlikely that odds differ between treatment and control groups by a factor of more than 250. By plugging in this number into (5), we obtain σ prior = 2.82.
Another way to motivate the prior standard deviation is by using the idea of unit information priors. 31,32 When the treatment effect parameter is on the log odds ratio scale (as in the BNHM), then the standard error is given by 1 Assuming equal allocation and a neutral effect, we can simply set the table allocation to a = b = c = d = N 4 . Therefore, if we (heuristically) reverse the argument, a prior for the log odds ratio with zero mean and 2.82 standard deviation gives 32 Hence N = 2. In other words, in terms of prior's effective sample size, the choice of σ prior = 2.82 is equivalent to adding 2 patients to the dataset. From this, it follows that N (0, 2.82) is a reasonable choice as a WIP for θ. Note also the analogy between this WIP and commonly used continuity corrections: Zero entries in a contingency tables are commonly "fixed" by adding a correction term of 0.5 to each table cell of the single-zero or double-zero study, which also amounts to a total of 2 patients "added" to the data. Note that the continuity correction adds 2 patients to each single-zero or double-zero study, while the use of WIP is equivalent to adding 2 patients to the whole dataset.

Empirical evidence supporting the weakly informative prior for the treatment effect
For an empirical investigation of the WIP for treatment effect parameter, we consider the meta-analysis datasets archived in the CDSR. All systematic reviews in the CDSR are available on the Cochrane Library website, 33 and personal or institutional access is required. For downloading the data from the CDSR and converting to CSV files, we use the program Cochrane_scraper (version 1.1.0). 34 We were able to access all Cochrane systematic reviews available in March 2018 (CD000004 to CD012788). Meta-analyses were excluded if they included only one study, if the analysis was labeled as a subgroup or sensitivity analysis or there was insufficient information for classification, or if all data within the meta-analysis appeared to be erroneous. Finally, we only consider meta-analyses with dichotomous outcomes. In total 37 773 meta-analysis datasets from 4 712 reviews are included. Note that we did not distinguish regarding efficacy or safety analyses. The frequency of the number of studies k considered for each meta-analysis is illustrated in Figure  1. The percentage of the meta-analyses including 5 or less studies is 66%. This figure is consistent with other re-analyses of the CDSR (see e.g. 15,35,36 ). We re-analysed the meta-analysis datasets from the CDSR using the BNHM via a maximum likelihood estimate (MLE) approach. This procedure is implemented using the R package lme4. 37 A histogram of the estimates of θ is illustrated in Figure 2A. 2.5% and 97.5% quantiles of the estimates of θ are -1.94 and 2.06, respectively. By following Turner et al, 35 we exclude the zero heterogeneity estimates; non-zero estimates of τ are shown in Figure 2B. The fraction of non-zero heterogeneity estimates is 63%, which is also consistent with previous findings. 35 The 95% quantile of non-zero estimates of τ is 1.51, while the 95% quantile of τ estimates including zeroes is 1.05. The underlying distribution of the estimates of θ and τ and their variability are useful to see how large are these estimates in some general population, in this case the CDSR. And, thus these give us a rough sense of what would be a reasonable default prior distribution. Therefore, we suggest the use of WIPs, N (0, 2.82) for θ and HN (0.5) for τ , which are consistent with estimates of θ and τ empirically observed in the CDSR.

Implementation of the proposed procedure in R using Stan
The Bayesian implementation of the BNHM can be fitted with the probabilistic programming language Stan 38 which employs a modern Markov chain Monte Carlo (MCMC) algorithm, namely Hamiltonian Monte Carlo with the No-U-Turn Sampler. 39 It is known that the parametrization of the

Simulation study
In order to assess the long-run properties of the proposed approach and compare it to some alternatives, we conducted a simulation study.

Simulation setup
The simulation scenarios are similar to those considered by Friede et al, 18 but with the important difference that we focus on rare events. The datasets are generated under the BNHM, more specifically (1). Numbers of studies (k ∈ {2, 3, 5}) and true treatment effects (θ = {−5, −4, −3, −2, −1, −0.5, 0, 0.5, 1, 2, 3, 4, 5}) are varied, resulting in a total of 39 simulation scenarios. To reflect the rare-event cases, true baseline risks on the probability scale are taken uniformly between 0.005 and 0.05. Following Kuss, 8 a lognormal distribution is fitted to the sample sizes obtained from the CDSR data, resulting in a log-normal distribution with parameters µ = 5 and σ = 1. Hence, sample sizes are generated from LN (5, 1 2 ), the minimum sample size is restricted to 2 patients (values below 2 are rounded up to 2), and at least 1 patient in each treatment arm is assumed. The degree of heterogeneity (τ ) is taken as τ = 0.28 (moderate heterogeneity), which is the median value of the predictive distribution for between-study heterogeneity in a meta-analysis as estimated by Turner et al. 35 According to a binomial probability of 0.5, patients were allocated to the treatment groups, thus mimicking randomization. The simulations were carried out with 10 000 replications per scenario. The data sparsity is reflected in the average fraction of single-zero or double-zero studies in a simulated meta-analysis dataset which are shown in Figure 3A. Notice that the fractions of the single-zero and double-zero studies are the highest when true treatment effect is -5, and they are decreasing with the increase of the treatment effect.
The proposed approach and two comparators are included in the analysis: BNHM using a vague prior (N (0, 100 2 )) for θ ("Vague"), BNHM using a WIP (N (0, 2.82 2 )) for θ ("WIP"), and BNHM using MLE ("MLE"). For both Vague and WIP approaches, the prior for τ and µ are taken as HN (0.5), and N (0, 10), respectively. Estimation failure were assumed to be 0 for Bayesian methods, and convergence was assumed to be reached after 1 000 iterations of burn-in. 3 chains were run in parallel for a total of 2 000 iterations. with We used the package lme4 for the MLE, whereas the Vague and WIP methods were fitted with our MetaStan package. The highest density intervals were  For the MLE, the fraction of failed runs (non-convergence) is shown in Figure 3B. Estimation failure is closely related to the fraction of meta-analysis datasets including single-zero or double-zero studies in the dataset, which can be seen by comparing Figure 3A and Figure 3B. This is because when the data is highly sparse, estimation becomes more challenging for the MLE. As a performance measure, we use the bias ( 1 N N i=1 (θ − θ)) based on the MLE and posterior medians. Moreover, the coverage probability and the mean length of interval estimates for θ are reported.

Simulation results
The bias for posterior medians from the Vague and the WIP methods, and for maximum likelihood estimate from the MLE across scenarios are displayed in the first row of Figure 4. Note that failed runs were excluded from the calculation of performance measures which is relevant only for the MLE. The MLE shows unacceptably high bias for the scenarios with θ ≤ 0, corresponding to the scenarios in which fraction of zero studies is very high. The WIP displays somewhat positive bias whereas the Vague shows negative bias for the scenarios with θ ≤ 0. It is important to note that the results of the bias behave similar to the fraction of zero studies and the fraction of non-convergence of the MLE, meaning that the bias is higher in scenarios with more sparse data. Since the Vague approach uses a vague prior on θ, one would expect a somewhat similar behaviour of bias from the Vague and the MLE approaches. However, the fact the data is highly sparse and also the Vague approach includes a weakly informative prior for τ may be explanations of the better performance of the Vague method in comparison to the MLE. Moreover, performance in terms of bias is improving for all methods when the number of studies k is increasing. Figure 4 also shows coverage probabilities and log mean lengths for 95% highest density intervals (HDIs) obtained by the Vague and the WIP, and for 95% Wald confidence intervals (CIs) obtained by the MLE. The WIP method shows higher coverage than nominal level across all different true treatment effects except θ = −5. On the other hand, the HDIs obtained by WIP are shorter in comparison to HDIs obtained by the Vague and CIs obtained by the MLE approaches.
Lastly, the bias for the heterogeneity parameter τ obtained by three methods (the MLE, the Vague and the WIP) are demonstrated in Figure 5. For Bayesian methods, posterior medians are used as the point estimates. Recall that, the prior used for τ in the Vague and the WIP is a weakly informative prior (HN (0.5)). The MLE under-estimates the true heterogeneity, whereas the Vague and the WIP methods, slightly, over-estimates it. These observations are in aligned with the conclusions made by Friede et al. 18

Example revisited
Returning to the dataset described in Section 2, we consider the data on death and PTLD outcomes shown in Table 1. The observed log odds ratios are displayed in Figure 6. To be able to visualize the observed log-odds ratios when there is single-zero or double-zero study, a continuity correction of 0.5 is added to all cells of the single-zero or double-zero study's contingency table. The wide CIs for observed log-odds ratios reflect the rather small sample sizes in the datasets. Furthermore, the variability in the point estimates can be seen as an indicator of the heterogeneity between trials.
We analyse the datasets using three methods investigated in the simulation studies, namely Vague, WIP, and MLE approaches. The code to implement the MLE is shown in Appendix B. Recall that the only difference between Vague and WIP is the prior used for the treatment effect parameter θ in the model, namely N (0, 100 2 ) for the former and N (0, 2.82 2 ) for the latter. WIP can be implemented in a routine data analysis using our MetaStan package as follows bnhm.wip.CrinsPTLD.stan <-meta_stan(ntrt = dat.Crins2014.PTLD$exp.total, nctrl = dat.Crins2014.PTLD$cont.total, rtrt = dat.Crins2014.PTLD$exp.PTLD.events, rctrl = dat.Crins2014.PTLD$cont.PTLD.event, tau_prior_dist = "half-normal", tau_prior = 0.5, delta = 250) The argument delta corresponds to δ from (5), and thus is used to calculate the WIP for θ. Alternatively, one can directly specify the prior for θ, in our case, equivalently, we can have theta_prior = c(0, 2.82). The Vague method is, simply, implemented by omitting the argument delta_u and specifying theta_prior = c(0, 100). To check MCMC convergence, we use the Gelman-Rubin statistics, and traceplots. For the WIP approach, the corresponding traceplots are shown in Figure  7 and Figure 8 (Appendix A) for death and PTLD outcomes, respectively. There was no divergence reported for both datasets. The MLE fit for the dataset with death outcome does not cause any warning from lme4. For PTLD outcomes, lme4 gives a warning suggesting that the estimates may not be reliable. Nevertheless, it produces the MLE estimate and CI for treatment effect parameter, and we report them. The results from three methods are shown in Figure 6. For the MLE method, the MLE and 95% CI are given. For Vague and WIP methods, posterior medians and 95% HDIs are shown. The estimates of the between-trial heterogeneityτ are also included in the figure. The point estimates from three methods look quite similar. MLE gives shorter interval estimates compared to Bayesian alternatives which is due to the fact that the uncertainty of τ is taken into account in both Bayesian approaches. In the original paper, Crins et al 23 fitted a normal-normal hierarchical model using REML, 24 and risk ratio was used as the measure of treatment effect. They concluded that treatment IL-2RA failed to show statistically significant result for reducing death. We obtained similar point estimates with somewhat wider interval estimates to Crins et al, 23 specifically their risk ratio estimate was 0.61 (CI 0.27-1.37), and we obtained the odds ratio estimate 0.57 (HDI 0.21-1.46) using the WIP method. Concerning the PTLD, the risk ratio was estimated as 1.60 (CI 0.20-12.67) by Crins et al, 23 the odds ratio is estimated 1.99 (HDI 0.20-25.35) using the WIP method. The wider interval estimates obtained by WIP may stem from the fact that the uncertainty of τ is taken into account.
Considering death as outcomes, the heterogeneity parameter τ is estimated 0.30, 0.28, and 0.00 using WIP, Vague, and MLE, respectively. Similarly for the PTLD outcomes, forτ , we obtained 0.33, 0.33, and 0.00 using WIP, Vague, and MLE, respectively. Moreover, Crins et al 23 concluded that there is no evidence for heterogeneity between trials using using Cochrane's Q test for both death and PTLD outcomes. Since the prior used for τ is the same for WIP and Vague, similar heterogeneity estimates are expected. On the other hand, the MLE estimate (τ = 0.00) is most probably underestimating the actual amount of heterogeneity.

Conclusions and Discussion
Random effects meta-analyses are commonly based on few studies only, which often poses problems for inference, as certain asymptotics cannot be relied upon. Additional issues arise for binary outcomes when only few or no events are observed in some of the studies or study arms. To deal with such data sparsity in the meta-analysis, we have proposed the use of weakly informative priors (WIPs) for the treatment effect parameter θ in a binomial-normal hierarchical model (BNHM). We demonstrated how a normal WIP for θ can be derived by considering an a priori interval for the treatment effect on a log odds ratio scale. Also, the empirical evidence obtained from 37 773 meta-analyses with binomial outcomes from the Cochrane Database of Systematic Reviews supports the proposed WIP. In simulation studies, the suggested method displays lower bias for θ and substantially shorter interval estimates for θ with somewhat higher coverage than nominal level in comparison to alternative methods.
The simulation results displayed in Figure 4 are somewhat in contrast to the results given by Friede et al, 18 who observed lower coverage than nominal level of MLE-methods in a similar setting, but not based on rare events. We also investigated a scenario closer to their setup by considering higher baseline risks between 0.05 and 0.20. The results are shown in Figure 9 in the Appendix C, and indeed here the MLE method exhibits lower coverage than nominal level, as reported by Chung et al 16 and Friede et al. 18 The high bias and too wide interval estimates obtained by the Vague and the MLE are still present, but not as high as in the results of the simulations in which true baseline risks are relatively lower.
The introduced approach is not restricted to the BNHM as introduced above; similar approaches may analogously be defined in similar cases, e.g. a Poisson-normal hierarchical model. However, a crucial point is that the treatment effect parameter is explicitly parameterized in the model, so that it can directly be "penalized" via the prior specification. Hence, so-called contrast-based models 22 (in which relative treatment effects are assumed to be exchangeable across trials) are suitable for this purpose unlike arm-based models.
Jackson et al 4 investigated 7 random effects meta-analysis models including the BNHM which we consider in this paper ("Model 4" in Jackson et al 4 ) and another parametrization of the BNHM ("Model 2" in Jackson et al 4 ). The only difference between two models is that in their Model 2, the treatment indicator x ik of (1) is +1 for the experimental arm, and 0 for the control arm. Note that commonly used network meta-analysis models for example 42 are generalizations of Model 2 in Jackson et al. 4 As reported by Jackson et al, 4 we also observe the underestimation of the heterogeneity parameter τ , and hence decided to only consider their Model 4. On the other hand, it is important to note that the usage of a weakly informative prior for θ also improves the performance in Model 2, as we have seen for the Model 4.
The BNHM can be extended to a network meta-analysis model 43 which is desirable if there are multiple treatments, and/or multi-arm trials in the dataset. Even if the data set in a network metaanalysis consists of many studies overall, some of the treatment effects may still be informed by few studies only. Thus, the use of WIPs for treatment effect parameters in the context of network metaanalysis with rare events can be very helpful. One may find it restrictive to have a normal prior for θ, it may be worth exploring alternatives like Cauchy or log-F distributions 14,11 for penalization. Different distributions as WIP for θ, different parametrizations of BNHM, or different data models can be implemented in Stan or MCMC methods in general. Although, currently, our package MetaStan is restricted to use a BNHM for pairwise meta-analysis, it can be extended to conduct meta-analysis and network meta-analysis with flexible data model and prior options. meta_stan is the main fitting function of this package. The main computations are executed via the rstan package's sampling function. We can fit the binomial-normal hierarchical using a weakly informative prior for treatment effect as follows: bnhm.wip.CrinsPTLD.stan <-meta_stan(ntrt = dat.Crins2014.PTLD$exp.total, nctrl = dat.Crins2014.PTLD$cont.total, rtrt = dat.Crins2014.PTLD$exp.PTLD.events, rctrl = dat.Crins2014.PTLD$cont.PTLD.event, tau_prior_dist = "half-normal", tau_prior = 0.5, delta = 250, chains = 4, iter = 2000, warmup = 1000) Convergence diagnostics and the results can be, very conveniently, obtained using shinystan package as follows: library("shinystan") ## Firstly convert "stan" object to a "shinystan" object bnhm.wip.CrinsPTLD.shinystan = as.shinystan(bnhm.wip.CrinsPTLD.stan$fit) launch_shinystan(bnhm.wip.CrinsPTLD.shinystan) Traceplots for the estimated parameters θ and τ including burn-in are shown in Figure 7 and Figure 8 for death and PTLD outcomes, respectively.
Lastly, the posterior summary statistics can be obtained using the following command: bnhm.wip.CrinsPTLD.stan$fit_sum B R code to implement BNHM using the MLE method

C Additional simulation results
We also conducted simulations using the same settings as described in Section 5 under BNHM, but using higher baseline risk probabilities, specifically, baseline risks (µ i ) are uniformly taken between 0.05 and 0.2. Results are illustrated in Figure 9 (analogous to Figure 5).

Highlights
What is already known: Standard meta-analysis methods are not suitable for meta-analysis of few studies with rare events. What is new: To deal with data sparsity present in the meta-analysis of few studies with rare events, we suggest the use of weakly informative priors as penalization for the treatment effect parameter. Potential impact for RSM readers outside the authors' field: To make it more accessible to meta-analyzers, a publicly available R package, MetaStan, is developed for fitting Bayesian metaanalysis models using weakly informative priors.