Zero‐cell corrections in random‐effects meta‐analyses

The standard estimator for the log odds ratio (the unconditional maximum likelihood estimator) and the delta‐method estimator for its standard error are not defined if the corresponding 2 × 2 table contains at least one “zero cell”. This is also an issue when estimating the overall log odds ratio in a meta‐analysis. It is well known that correcting for zero cells by adding a small increment should be avoided. Nevertheless, these zero‐cell corrections continue to be used. With this Brief Method Note, we want to warn of a particularly bad zero‐cell correction. For this, we conduct a simulation study comparing the following two zero‐cell corrections under the ordinary random‐effects model: (a) adding ½ to all cells of all the individual studies' 2 × 2 tables independently of any zero‐cell occurrences and (b) adding ½ to all cells of only those 2 × 2 tables containing at least one zero cell. The main finding is that correction (a) performs worse than correction (b). Thus, we strongly discourage the use of correction (a).


1 | INTRODUCTION
In a clinical trial with a binary outcome and a binary group indicator (here: "treatment" and "control" group), the standard log odds ratio (log OR) estimator is not defined if there is at least one cell with a frequency of zero (a "zero cell") in the corresponding 2 × 2 table.
To circumvent the zero-cell problem in the context of a single study, Haldane 1 and Anscombe 2 proposed to add ½ to all cells of the 2 × 2 table (whether or not the table contains a zero cell). 3,4 Here, we will call this the "Haldane-Anscombe" (HA) correction. Alternatively, one may add ½ to all cells of the 2 × 2 table only if there is at least one zero cell. Here, we will call this the "modified Haldane-Anscombe" (mHA) correction.
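As a minimal numeric illustration (with hypothetical counts, not taken from any study), the following Python sketch shows that the standard log OR estimator breaks down for a table with a zero cell, while both corrections yield finite values; for a zero-cell table the two corrections coincide by construction:

```python
import math

def log_or(a, b, c, d):
    """Standard log OR for a 2x2 table (a, b: treatment events/non-events;
    c, d: control events/non-events). Not defined if any cell is zero."""
    if min(a, b, c, d) == 0:
        return float("nan")  # estimator not defined
    return math.log((a * d) / (b * c))

def log_or_ha(a, b, c, d):
    """Haldane-Anscombe (HA): always add 1/2 to every cell."""
    return math.log(((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)))

def log_or_mha(a, b, c, d):
    """Modified HA (mHA): add 1/2 only if the table contains a zero cell."""
    if min(a, b, c, d) == 0:
        return log_or_ha(a, b, c, d)
    return log_or(a, b, c, d)

# Hypothetical table with a zero cell in the control group:
print(log_or(3, 97, 0, 100))      # nan: standard estimator undefined
print(log_or_ha(3, 97, 0, 100))   # finite
print(log_or_mha(3, 97, 0, 100))  # identical to HA here (zero cell present)
# Without a zero cell, the two corrections differ:
print(log_or(3, 97, 2, 98), log_or_ha(3, 97, 2, 98), log_or_mha(3, 97, 2, 98))
```

Note that for tables without a zero cell, the mHA correction leaves the standard estimator unchanged, whereas the HA correction always perturbs it.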
Here, we focus on meta-analyses, estimating the overall log OR using the ordinary random-effects (RE) model. We assume that the 2 × 2 tables of all the individual studies are given. Obviously, the zero-cell problem carries forward from single studies to meta-analyses, especially when investigating rare events such as adverse events or rare diseases. 5 Of course, zero-cell corrections should be avoided in the first place, 6,7 for example, by using the arcsine difference as an alternative effect measure 8 or by using the binomial-normal hierarchical model (BNHM) instead of the ordinary RE model, which is a normal-normal hierarchical model (NNHM). The BNHM may be used either in a Bayesian framework [9][10][11] or in a frequentist framework. [12][13][14] More extensive overviews of meta-analytic methods avoiding a zero-cell correction are given by Kuss 15 and Efthimiou. 16 However, despite the fact that the problems of zero-cell corrections are well known, they are still used in practice (see, for example, Chappuis et al 17 ), which might be due to the fact that common guidelines have not been fully adopted yet [18][19][20] and that many statistical software packages still use zero-cell corrections by default. 19 Thus, we think it is important to warn the scientific community of a particularly bad zero-cell correction: we observed that the HA correction performs particularly badly compared to the mHA correction.
In the context of a single study, the statistical properties of the HA and the mHA correction have already been compared: Walter 21 showed that the (absolute) bias and the mean squared error of the HA correction are usually smaller than those of the mHA correction. However, Walter 21 only referred to the estimation of the log OR in a single study. Here, we focus on the estimation of the overall log OR in meta-analyses.
In the context of meta-analyses, some zero-cell corrections have already been compared, 6,22 but to our knowledge, the performance of the HA and the mHA correction has not been compared directly. Sankey et al 22 do not include the mHA correction, while Sweeting et al 6 do not include the HA correction. We note that Bhaumik et al 23 use both the HA and the mHA correction, but they use different meta-analytic methods for the two corrections, so their work cannot be used for a direct comparison of the HA and the mHA correction.
In this Brief Method Note, we therefore compare the performance of the HA and the mHA correction in terms of the bias induced in a point estimator for the overall log OR as well as in terms of the coverage and the interval length of an interval estimator for the overall log OR. This is achieved with the help of a simulation study.
Our work is structured as follows: In Section 2, we briefly describe the statistical methods necessary for the simulation study, whose results are presented in Section 3. Section 4 concludes with a discussion.

2 | METHODS
Consider a meta-analysis of K ≥ 2 independent studies, each with the binary outcome "event/no event". Let θ k denote the true log OR in study k ∈ {1, …, K} for comparing the treatment group against the control group.

2.1 | Zero-cell corrections
Consider the 2 × 2 table for study k in Table 1, with $x_{T,k}$ events among the $N_{T,k}$ patients in the treatment group and $x_{C,k}$ events among the $N_{C,k}$ patients in the control group. The standard (unconditional maximum likelihood) estimator for θ_k is

$$\hat{\theta}_k = \log\!\left(\frac{x_{T,k}\,(N_{C,k}-x_{C,k})}{x_{C,k}\,(N_{T,k}-x_{T,k})}\right), \quad (1)$$

with log denoting the natural logarithm. Accordingly, the standard error (SE) of $\hat{\theta}_k$ is typically estimated by the delta-method estimator 4,24

$$\hat{\nu}_k = \sqrt{\frac{1}{x_{T,k}} + \frac{1}{N_{T,k}-x_{T,k}} + \frac{1}{x_{C,k}} + \frac{1}{N_{C,k}-x_{C,k}}}. \quad (2)$$

As is standard practice for meta-analyses, 25 we treat the true SE in study k as a known constant equal to its estimate, the realization of $\hat{\nu}_k$.
Instead of $\hat{\theta}_k$ and $\hat{\nu}_k$, the HA correction defines 1-4

$$\hat{\theta}_k^{HA} = \log\!\left(\frac{(x_{T,k}+\frac{1}{2})\,(N_{C,k}-x_{C,k}+\frac{1}{2})}{(x_{C,k}+\frac{1}{2})\,(N_{T,k}-x_{T,k}+\frac{1}{2})}\right) \quad (3)$$

and

$$\hat{\nu}_k^{HA} = \sqrt{\frac{1}{x_{T,k}+\frac{1}{2}} + \frac{1}{N_{T,k}-x_{T,k}+\frac{1}{2}} + \frac{1}{x_{C,k}+\frac{1}{2}} + \frac{1}{N_{C,k}-x_{C,k}+\frac{1}{2}}} \quad (4)$$

for all k, respectively. In contrast, the mHA correction introduces the same modifications only if the 2 × 2 table of study k contains at least one zero cell:

$$\hat{\theta}_k^{mHA} = \begin{cases} \hat{\theta}_k^{HA} & \text{if the table of study } k \text{ contains a zero cell} \\ \hat{\theta}_k & \text{otherwise} \end{cases} \quad (5)$$

and

$$\hat{\nu}_k^{mHA} = \begin{cases} \hat{\nu}_k^{HA} & \text{if the table of study } k \text{ contains a zero cell} \\ \hat{\nu}_k & \text{otherwise.} \end{cases} \quad (6)$$

When conducting a meta-analysis with the R 27 package metafor, 28 the HA correction is obtained by choosing the value "all" for metafor::rma()'s argument to (assuming argument add is left at its default value ½). The default for argument to (the value "only0") is the mHA correction.
In the following description of the methods, we will use $\hat{\theta}_k$ as a placeholder for either $\hat{\theta}_k$, $\hat{\theta}_k^{HA}$, or $\hat{\theta}_k^{mHA}$; the same holds for $\hat{\nu}_k$ as a placeholder for $\hat{\nu}_k$, $\hat{\nu}_k^{HA}$, or $\hat{\nu}_k^{mHA}$, respectively.
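As a concrete illustration of the two corrected estimators and their SEs (our own Python sketch, not the authors' code; each study's table is given by the event counts and group sizes in the treatment and control group):

```python
import math

def corrected_estimates(tables, correction="mHA"):
    """Return lists (theta_hat, nu_hat) for 2x2 tables given as tuples
    (x_T, N_T, x_C, N_C), under the HA or mHA zero-cell correction."""
    thetas, nus = [], []
    for x_t, n_t, x_c, n_c in tables:
        cells = [x_t, n_t - x_t, x_c, n_c - x_c]
        has_zero = min(cells) == 0
        # HA: always add 1/2; mHA: add 1/2 only if a zero cell occurs
        add = 0.5 if (correction == "HA" or has_zero) else 0.0
        a, b, c, d = (cell + add for cell in cells)
        thetas.append(math.log((a * d) / (b * c)))  # log OR estimate
        nus.append(math.sqrt(1/a + 1/b + 1/c + 1/d))  # delta-method SE
    return thetas, nus

# Hypothetical meta-analysis: second study has a zero cell
tables = [(10, 100, 5, 100), (0, 50, 3, 50)]
th_ha, nu_ha = corrected_estimates(tables, "HA")
th_mha, nu_mha = corrected_estimates(tables, "mHA")
# Corrections agree on the zero-cell table, differ on the other:
print(th_ha[0] != th_mha[0], th_ha[1] == th_mha[1])
```

The sketch makes explicit that the HA correction perturbs every study's estimate, while the mHA correction touches only the studies that actually need a correction.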

2.3 | Sidik-Jonkman estimator
Based on the results from Weber et al, 31 we will use the Hartung-Knapp/Sidik-Jonkman (HKSJ) confidence interval (CI) [32][33][34] (which will be defined in Section 2.4) for the interval estimation of θ. Correspondingly, 31 we will use the Sidik-Jonkman (SJ) estimator 35 for estimating τ². The SJ estimator is defined through

$$\hat{\tau}^2_{SJ} = \hat{\tau}^2_0 \cdot \frac{Q_{gen}(\hat{\tau}^2_0)}{K-1} \quad (7)$$

and the generalized Q statistic

$$Q_{gen}(\tau^2) = \sum_{k=1}^{K} w_k(\tau^2)\left(\hat{\theta}_k - \hat{\theta}(\tau^2)\right)^2 \quad (8)$$

with the weights $w_k(\tau^2) = 1/(\hat{\nu}_k^2 + \tau^2)$, the pooled estimator $\hat{\theta}(\tau^2) = \sum_{k=1}^{K} w_k(\tau^2)\,\hat{\theta}_k \big/ \sum_{k=1}^{K} w_k(\tau^2)$, and the initial estimate $\hat{\tau}^2_0 = \frac{1}{K}\sum_{k=1}^{K} (\hat{\theta}_k - \bar{\theta})^2$, where $\bar{\theta}$ denotes the unweighted mean of the $\hat{\theta}_k$.
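The SJ estimator can be sketched in Python as follows (our own illustration, assuming the standard two-step definition with an initial unweighted variance estimate; not the authors' code):

```python
def sj_tau2(thetas, nus):
    """Sidik-Jonkman estimator of tau^2 from study estimates `thetas`
    and their (treated-as-known) standard errors `nus`."""
    k = len(thetas)
    mean = sum(thetas) / k
    # Step 1: initial estimate = unweighted empirical variance of theta_k
    tau2_0 = sum((t - mean) ** 2 for t in thetas) / k
    # Step 2: generalized Q statistic evaluated at the initial estimate
    w = [1.0 / (nu ** 2 + tau2_0) for nu in nus]  # weights w_k(tau2_0)
    theta_w = sum(wi * ti for wi, ti in zip(w, thetas)) / sum(w)
    q_gen = sum(wi * (ti - theta_w) ** 2 for wi, ti in zip(w, thetas))
    return tau2_0 * q_gen / (k - 1)

# Hypothetical study estimates and standard errors:
print(sj_tau2([0.2, 0.8, 0.5], [0.3, 0.4, 0.25]))
```

If all study estimates coincide, the initial estimate and hence the SJ estimate are zero; in all other cases the estimator is strictly positive, which is one of its practical advantages.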

2.4 | Hartung-Knapp/Sidik-Jonkman confidence interval
The HKSJ CI is defined through

$$\hat{\theta}(\hat{\tau}^2) \pm t_{K-1;\,1-\alpha/2}\,\sqrt{\frac{\sum_{k=1}^{K} w_k(\hat{\tau}^2)\left(\hat{\theta}_k - \hat{\theta}(\hat{\tau}^2)\right)^2}{(K-1)\sum_{k=1}^{K} w_k(\hat{\tau}^2)}} \quad (9)$$

with $\hat{\tau}^2$ denoting an arbitrary estimator for τ² (here: the SJ estimator), $w_k(\tau^2) = 1/(\hat{\nu}_k^2 + \tau^2)$ the inverse-variance weights, $\hat{\theta}(\tau^2)$ the corresponding weighted pooled estimator, and $t_{K-1;\,1-\alpha/2}$ the (1 − α/2) quantile of the t distribution with K − 1 degrees of freedom. In the following, the HKSJ CI using the SJ estimator for τ² will be abbreviated by "HKSJ-SJ CI".
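A Python sketch of the HKSJ interval under the standard definition (our own illustration; the t quantile is passed in as a parameter so the sketch needs no external dependency, e.g. scipy.stats.t.ppf(0.975, k - 1) would supply it):

```python
import math

def hksj_ci(thetas, nus, tau2, t_quantile):
    """HKSJ confidence interval for the overall log OR, given study
    estimates, their SEs, a tau^2 estimate, and the (1 - alpha/2)
    t quantile with K - 1 degrees of freedom."""
    w = [1.0 / (nu ** 2 + tau2) for nu in nus]  # weights w_k(tau2)
    theta = sum(wi * ti for wi, ti in zip(w, thetas)) / sum(w)
    k = len(thetas)
    q = sum(wi * (ti - theta) ** 2 for wi, ti in zip(w, thetas))
    se = math.sqrt(q / ((k - 1) * sum(w)))  # HKSJ standard error
    return theta - t_quantile * se, theta + t_quantile * se

# Hypothetical inputs; 4.303 is the 0.975 quantile of t with 2 df (K = 3):
lo, hi = hksj_ci([0.2, 0.8, 0.5], [0.3, 0.4, 0.25], 0.03, 4.303)
print(lo, hi)
```

Note that the HKSJ standard error scales the usual inverse-variance standard error by the observed dispersion of the study estimates, which is what drives its robustness.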

2.5 | Simulation setup
The setup of our simulation study corresponds to that of the log OR part from Weber et al. 31 Briefly, we choose α = 0.05, K ∈ {2, 3, 5, 8, 13, 21, 34, 55}, θ = 0.5, τ² ∈ {0, 0.05, 0.15, 0.5, 1.5}, and randomly drawn "small," "large," and "mixed" treatment and control group sizes N_{T,k} = N_{C,k} (see Weber et al 31 for details). For the generation of the true study-specific log ORs θ_k analogously to Hartung and Knapp, 33 we use an "overall control group event probability" of p_C ∈ {0.1, 0.5} (see Weber et al 31 for details). For each scenario, we estimate the bias, the coverage, and the interval length by averaging over 5000 replications.
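The data-generating step can be sketched in Python as follows (a simplified stand-in for the actual setup of Weber et al, which handles group sizes and the θ_k generation in more detail; the group size n = 50 and the default parameter values here are hypothetical):

```python
import math
import random

def simulate_replication(k_studies=5, theta=0.5, tau2=0.15, p_c=0.1,
                         n=50, rng=None):
    """One simplified replication: draw study-specific log ORs from
    N(theta, tau2), generate binomial 2x2 tables with equal group sizes n,
    and return HA- and mHA-corrected log OR estimates."""
    rng = rng or random.Random(1)
    est_ha, est_mha = [], []
    for _ in range(k_studies):
        theta_k = rng.gauss(theta, math.sqrt(tau2))
        # Treatment event probability via the logit link
        logit_pt = math.log(p_c / (1 - p_c)) + theta_k
        p_t = 1 / (1 + math.exp(-logit_pt))
        x_t = sum(rng.random() < p_t for _ in range(n))
        x_c = sum(rng.random() < p_c for _ in range(n))
        cells = [x_t, n - x_t, x_c, n - x_c]
        for correction, out in (("HA", est_ha), ("mHA", est_mha)):
            add = 0.5 if (correction == "HA" or min(cells) == 0) else 0.0
            a, b, c, d = (cell + add for cell in cells)
            out.append(math.log((a * d) / (b * c)))
    return est_ha, est_mha

# One replication; in the study, quantities like the bias are averaged
# over 5000 such replications per scenario.
est_ha, est_mha = simulate_replication()
```

Pooling each replication's corrected estimates with the HKSJ-SJ machinery and averaging over replications then yields the bias, coverage, and interval length reported below.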

3 | RESULTS
The simulation results for the bias are shown in Figure 1. The plots for the coverage and the interval length may be found in the Supporting Information 1 (Figures S1 and S2, respectively). In all these figures, we use multiple nested loop plots 36 arranged in a grid. The key property of a nested loop plot is a nesting of different simulation parameters on the x-axis. Here, we nest p C in K. The performance measure (either the bias, the coverage, or the interval length) will be plotted on the y-axis, with the interval length being plotted on logarithmic scale. We include the Monte Carlo standard error (MCSE) of the performance measures using semitransparent bands.
First of all, Figure 1 shows that the absolute bias under both zero-cell corrections is usually larger for p_C = 0.1 than for p_C = 0.5. This is not surprising, as the probability of low frequencies increases as the event probability departs from 0.5, and the lower a frequency, the higher the impact of adding a constant to it (even if this constant is small). Secondly, the bias is typically negative (especially for the HA correction), most probably because adding ½ to all cells of the 2 × 2 table of study k generally pushes $\hat{\theta}_k$ closer to 0 (whereas the true overall value used here was θ = 0.5, see Section 2.5). Thirdly and most importantly, the absolute bias of the HA correction is nearly always larger than that of the mHA correction. Only in scenarios with K ∈ {2, 3} and p_C = 0.5 is the absolute bias of the HA correction usually smaller than that of the mHA correction. In these scenarios, however, the differences in absolute bias are rather small compared to those scenarios where the mHA correction performs better.
The larger absolute bias of the HA correction has a considerable impact on the coverage (see Figure S1): For small and mixed N T,k , large K, and p C = 0.1, the coverage under the HA correction is well below the nominal level of 0.95 whereas the coverage under the mHA correction is close to the nominal level. The impact of the bias on the coverage increases with increasing K which may be due to the decreasing interval length for increasing K (see Figure S2) as a shorter length allows a larger impact of the bias on the coverage. Besides, the absolute bias usually increases with increasing K (see Figure 1), but this increase is small and thus, we do not consider this to be the main driver of the reduced coverage.
We note that there are only a few scenarios where the HA correction performs better in terms of coverage, but the advantage of the HA correction in these scenarios is smaller than the advantage of the mHA correction in those scenarios where the mHA correction performs better.
We also note that in the few scenarios named above where the HA correction has smaller absolute bias, the coverage is nearly identical between the HA and the mHA correction (especially when taking into account the Monte Carlo error). Figure S2 shows that the interval length under the HA correction is slightly smaller than under the mHA correction (again, especially for small and mixed N_{T,k}). This may be due to a smaller SE of $\hat{\theta}_k$ (given θ_k) under the HA correction (see Equations (4) and (6), where the denominators of the SE estimator are increased more frequently under the HA correction than under the mHA correction). Smaller SEs of the $\hat{\theta}_k$ (given the θ_k) result in a smaller variance $1/\sum_{k=1}^{K} w_k(\hat{\tau}^2)$ of $\hat{\theta}(\hat{\tau}^2)$. To conclude, all these results show that the HA correction performs worse than the mHA correction.

4 | DISCUSSION
We have shown that, when estimating the overall log OR in the meta-analytic RE model, the HA correction (i.e., always adding ½ to all cells of the study-specific 2 × 2 tables) should not be used as a zero-cell correction. Our simulation results also show that this warning is especially important in case of rare events (or rare complementary events). Nevertheless, we repeat that avoiding a zero-cell correction in the first place is clearly recommended (see Section 1).
As a comparison to the HA correction, we used the mHA correction (i.e., adding ½ to all cells of a 2 × 2 table only if there is at least one zero cell in this 2 × 2 table). The fact that our results (showing an inferiority of the HA correction in meta-analyses) differ from the results of Walter 21 (showing an inferiority of the mHA correction in a single study) is somewhat surprising. Future studies should investigate this disagreement in detail.
Our simulation study was kept simple since the purpose of this Brief Method Note was to highlight a problem with the HA correction (which we have encountered under specific, but still realistic circumstances). For example, we did not investigate unbalanced group sizes, meta-analytic methods other than the HKSJ-SJ CI, or an overall log OR of θ ≠ 0.5. Especially with respect to the meta-analytic method (including the τ 2 estimator), other choices might lead to different findings.
Concerning the first of these possible extensions, it has been noted that zero-cell corrections are especially problematic in case of unbalanced group sizes, 5,6,8 with an increasing deterioration for increasing imbalance. 6 This was shown for the mHA correction as well as for the "group size correction" 8 proposed by Sweeting et al 6 (with the "group size correction" performing slightly better in case of unbalanced group sizes). 6 We assume that this also holds for the HA correction, but this remains to be shown.

What is already known
• For a binary outcome and a binary group indicator, the occurrence of at least one "zero cell" in the corresponding 2 × 2 table makes the estimation of the log odds ratio impossible.
• Zero-cell corrections (consisting of the addition of a small increment to the cell frequencies) should be avoided since they worsen the performance of the statistical methods building upon them. Despite this warning, zero-cell corrections are still used in practice.

What is new
• In the context of meta-analyses, we show that adding ½ to all cells of all the study-specific 2 × 2 tables leads to a particularly bad performance compared to adding ½ to all cells of only those 2 × 2 tables containing at least one zero cell.

Potential impact for RSM readers outside the authors' field
• If a zero-cell correction is absolutely required, then adding ½ to all cells of all the study-specific 2 × 2 tables is not recommended. However, zero-cell corrections should be avoided in the first place by more appropriate statistical methods.