Reporting ‘number needed to treat’ in meta-analyses: A cross-sectional study
Associate Professor Cho Naing, School of Postgraduate Studies and Research, International Medical University, Kuala Lumpur 57000, Malaysia. Tel: +603-8656-7228 Fax: +603-8656-7229 Email: firstname.lastname@example.org
Aim: In translating clinical research into practice, it is important to summarize data from randomized trials using measures of effect that point-of-care clinicians can readily appreciate. In this context, the literature has highlighted the ‘number needed to treat’ as a useful measure. The objectives of our study were to assess how meta-analyses described number needed to treat and its corresponding 95% CI, and to explore issues related to reporting number needed to treat in the selected meta-analyses.
Method: For an illustration, we searched for the Cochrane systematic reviews and non-Cochrane systematic reviews. Two-stage selection was done to identify eligible studies. First, we fixed a date and then, we searched meta-analyses in PUBMED available on the date fixed. Secondly, we purposively selected five Cochrane systematic reviews and three non-Cochrane systematic reviews, according to our inclusion criteria. The critical appraisal of meta-analyses identified for the current study was done with the 5-item quality checklist introduced to the current analysis.
Results: A total of 8 systematic reviews, 5 Cochrane systematic reviews and 3 non-Cochrane systematic reviews/meta-analyses, were identified for the present study. Of these 8 meta-analyses, half (50%; 4/8) described number needed to treat in the methods section of the study. However, the majority (87.5%; 7/8) reported number needed to treat in the results. In detail, 80% of Cochrane reviews and 66.7% of non-Cochrane reviews reported number needed to treat in the results. Only two studies (25%; 2/8) reported susceptibility to publication bias, provided a simplified interpretation, or discussed number needed to treat.
Conclusion: Although the Cochrane Handbook for Systematic Reviews of Interventions suggests that reviewers include number needed to treat when reporting effect estimates, reporting still needs to improve.
A pivotal step in translating clinical research into practice is the summarization of data from randomized trials in terms of measures of effect that can be readily appreciated by point-of-care clinicians (1). The most commonly encountered effect measures in clinical trials with dichotomous data are the risk ratio (relative risk, RR), odds ratio (OR), risk difference (or absolute risk reduction, ARR), and number needed to treat (NNT) (2, 3). The former two are relative measures, while the latter two are absolute measures. Each effect measure has its own statistical merits and limitations. Of these, RR, OR, and ARR are used extensively in both clinical and epidemiological investigations. RR and OR are particularly useful when considering the trade-off between likely benefits and likely harms of an intervention (3); the debate over the choice between RR and OR is beyond the scope of this study. In clinical decision making, however, the NNT is more meaningful. For any trial that has reported a binary outcome, it is calculated as the reciprocal of the ARR, that is, 1/ARR (or 100/ARR if percentages are used rather than proportions) (1, 2, 4). The 95% CI for the NNT is obtained by taking reciprocals of the values defining the CI for the ARR (2, 3), and reporting it alongside effect measures in meta-analyses is recommended. When the treatment effect is not statistically significant, the 95% CI for the ARR spans zero, and one limit of the CI for the NNT will be negative. A negative value is often termed the number needed to treat to harm (NNH), whereas a positive value is better termed the number needed to treat to benefit (NNB) (1, 2). For simplicity, we use the term NNT to encompass both NNB and NNH, as is often done in published reviews.
NNH reflects detrimental events, indicating deterioration in outcome, while NNB reflects beneficial events, indicating improvement in outcome. The NNT is defined as the expected number of people who need to receive the experimental intervention rather than the comparator for one additional person to either incur or avoid an event in a given time frame (3). Computing NNT by converting from RR, OR, and ARR is described in detail elsewhere (1-3, 5). The illustration in this study is mainly concerned with the conversion of ARR to NNT over fixed follow-up periods. NNT for time-specific follow-ups is beyond the scope of this work. It is important to note two characteristics of NNT in studies with dichotomous outcomes: (1) since the NNT is derived from the ARR, it is still a comparative measure of effect (ie, experimental versus control) and not a general property of a single intervention, and (2) the NNT is an expected value.
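The reciprocal relationship between ARR and NNT described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any of the reviews: the function names are hypothetical, the ARR is taken as the experimental proportion minus the control proportion (matching X – Y in Table 1), and the 95% CI for the ARR uses a simple Wald approximation, which is an assumption. The example counts are those of Study 1 in Table 2.

```python
import math


def reciprocal(value):
    """Return 1/value, mapping 0 to infinity instead of raising."""
    return math.inf if value == 0 else 1.0 / value


def nnt_with_ci(events_exp, n_exp, events_ctl, n_ctl, z=1.96):
    """ARR and NNT with approximate 95% CIs for one two-arm comparison."""
    p_exp = events_exp / n_exp
    p_ctl = events_ctl / n_ctl
    arr = p_exp - p_ctl  # assumed sign convention: experimental minus control

    # Wald standard error of a difference between two proportions (assumption).
    se = math.sqrt(p_exp * (1 - p_exp) / n_exp + p_ctl * (1 - p_ctl) / n_ctl)
    arr_low, arr_high = arr - z * se, arr + z * se

    return {
        "arr": arr,
        "arr_ci": (arr_low, arr_high),
        "nnt": reciprocal(arr),
        # Reciprocals of the ARR limits; the order flips when inverting.
        "nnt_ci": (reciprocal(arr_high), reciprocal(arr_low)),
        # True when the ARR interval includes zero (non-significant effect).
        "spans_zero": arr_low <= 0 <= arr_high,
    }


result = nnt_with_ci(73, 100, 3, 25)
print(round(result["nnt"], 2), [round(x, 2) for x in result["nnt_ci"]])
```

When `spans_zero` is true, the second NNT limit is negative and corresponds to an NNH in the terminology above; in that case the two reciprocals should not be read as the ends of one continuous range.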
For dichotomous outcomes, the results should, where possible, be expressed as NNT with 95% CI and the baseline risk to which it applies (6). Moreover, the literature suggests that NNT can be used to assess the likelihood of publication bias (ie, susceptibility to publication bias). This is done by estimating the number of participants in studies with zero effect (relative benefit of one) that would be needed to give an NNT too high to be clinically relevant (7).
Taken together, the objectives of our study were to assess how meta-analyses described NNT and its corresponding 95% CI, and to explore issues related to reporting NNT in the selected meta-analyses.
Table 1 provides a computation framework for NNT, while Table 2 illustrates the computation of NNT using hypothetical data. To assess susceptibility to publication bias, we estimated, using an NNT of predetermined utility, how much data (trials, participants) would have to be both unpublished and of zero treatment effect (relative risk/benefit of 1) to make a result clinically irrelevant. Table 3 demonstrates the stepwise calculation of susceptibility to publication bias using an arbitrary NNT value (ie, the predetermined utility value).
Table 1. Illustration of the computation framework of NNT in meta-analyses
|Study||Experimental group (events/total)||Control group (events/total)||RR||ARR||NNT|
|Study 1||A/n1||G/n7|| || || |
|Study 2||B/n2||H/n8|| || || |
|Study 3||C/n3||I/n9|| || || |
|Study 4||D/n4||J/n10|| || || |
|Study 5||E/n5||K/n11|| || || |
|Study 6||F/n6||L/n12|| || || |
|Summary measures in meta-analyses||X = (A + B + C + D + E + F)/(n1 + n2 + n3 + n4 + n5 + n6)||Y = (G + H + I + J + K + L)/(n7 + n8 + n9 + n10 + n11 + n12)||X/Y||X – Y||1/(X – Y)|
Table 2. Illustration of number needed to treat in meta-analyses
|Study||Experimental group (events/total)||Control group (events/total)||RR||ARR||NNT|
|Study 1||73/100||3/25|| || || |
|Study 2||73/97||18/46|| || || |
|Study 3||61/76||6/49|| || || |
|Study 4||37/50||6/50|| || || |
|Study 5||62/100||1/50|| || || |
|Study 6||26/80||10/175|| || || |
|Summary (95% CI)|| || ||(4.18–7.93)||(0.48–0.59)||(1.5–2)|
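The framework of Table 1 can be applied to the counts in Table 2 with a short script. The sketch below pools the raw counts exactly as Table 1 describes (summed events over summed totals); it is not a weighted meta-analytic estimate such as a Mantel-Haenszel or inverse-variance summary. Treating the first column of counts as the experimental group is an assumption, and the function names are hypothetical.

```python
# (events, total) pairs transcribed from Table 2.
# Assumption: first column = experimental group, second column = control group.
experimental = [(73, 100), (73, 97), (61, 76), (37, 50), (62, 100), (26, 80)]
control = [(3, 25), (18, 46), (6, 49), (6, 50), (1, 50), (10, 175)]


def pooled_risk(arms):
    """Summed events divided by summed participants, as in Table 1."""
    events = sum(e for e, _ in arms)
    total = sum(n for _, n in arms)
    return events / total


def summary_measures(exp_arms, ctl_arms):
    """Return X, Y, RR = X/Y, ARR = X - Y and NNT = 1/(X - Y)."""
    x = pooled_risk(exp_arms)
    y = pooled_risk(ctl_arms)
    arr = x - y
    return {"X": x, "Y": y, "RR": x / y, "ARR": arr, "NNT": 1 / arr}


summary = summary_measures(experimental, control)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the pooled RR, ARR, and NNT each fall inside the corresponding interval in the last row of Table 2.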
Selection of studies
Two-stage selection was done to identify eligible studies. First, we fixed a date (chosen to suit the authors' schedule) and identified meta-analyses available in PubMed on that date, using the free-text search terms ‘meta-analysis’ and ‘randomized controlled trials’. Second, we purposively selected five Cochrane systematic reviews and three non-Cochrane systematic reviews. The selection criteria were: (1) a quantitative synthesis was included; (2) sufficient data were available to calculate RR and ARR; (3) the individual RCTs assessed pharmacological interventions; (4) only one review from each Cochrane group entity, published in 2012; (5) only one article from each journal, published in 2012; (6) free access to the full article; and (7) publication in English.
Two authors independently assessed the reporting of NNT in the selected meta-analyses. The critical appraisal of the meta-analyses identified for this study was done with a 5-item quality checklist, based on a literature review and consultations with researchers. The five items are as follows: (1) Does the meta-analysis describe NNT appropriately in the methods? (2) Does the meta-analysis report NNT in the results? (3) If reported, is the susceptibility to publication bias based on NNT included? (4) Is a simplified interpretation of the NNT provided? (5) Does the meta-analysis comment on NNT in the discussion? We gave ‘yes’, ‘unclear’ (ie, incomplete), or ‘no’ to each item. Any discrepancy between the two authors was resolved by discussion and by taking advice from the senior author. If NNT was described, we checked the reported NNT, following the process described in Tables 2 and 3, using an example dataset (8). For this study, both data entry and data analysis were done in an Excel spreadsheet. The findings of this study were reported according to the Strengthening the Reporting of Observational Studies in Epidemiology statement (STROBE; supporting material) (9).
Table 3. Detection of zero treatment effect trials
|Step||Question||Notation||Example value|
|1||How many patients in the data set (RCT or meta-analyses)?||A||798|
|2||What was the NNT obtained in the RCT or meta-analyses?||B||1.8|
|3||What NNT value would be the limit of clinical utility or acceptability?||C||12a|
| || || Calculation b |
|4||Divide step 1 by step 2||A/B = D||434.7|
|5||Multiply step 4 by step 3||D × C = E||5217|
|6||Subtract step 1 from step 5||E – A = F||4419|
|(Extra patients needed for the NNT to move from calculated NNT to the set NNT)|
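Steps 4 to 6 of Table 3 can be written as a small function. The sketch below follows the table's notation (A, B, C, D, E, F); the function name is hypothetical. One way to read the arithmetic, offered as an interpretation rather than a statement from the source, is that D = A/B approximates the number of additional responders attributable to treatment, which stays fixed when zero-effect participants are added, so the total number of participants at which the NNT reaches the threshold C is D × C.

```python
def extra_zero_effect_participants(n_participants, observed_nnt, threshold_nnt):
    """Follow steps 4-6 of Table 3 and return the tuple (D, E, F).

    n_participants -- A, participants in the data set
    observed_nnt   -- B, NNT obtained in the RCT or meta-analysis
    threshold_nnt  -- C, NNT taken as the limit of clinical utility
    """
    d = n_participants / observed_nnt  # step 4: A / B = D
    e = d * threshold_nnt              # step 5: D x C = E
    f = e - n_participants             # step 6: E - A = F
    return d, e, f


# Round numbers chosen only for illustration: 1000 participants, NNT of 2,
# and a clinical-utility threshold of 10.
d, e, f = extra_zero_effect_participants(1000, 2, 10)
print(d, e, f)
```

Entering the rounded NNT of 1.8 from Table 3 gives somewhat larger values than those tabulated. The tabulated figures are consistent with an unrounded NNT of about 1.84, which is an inference from the arithmetic, not something stated in the source.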
A total of eight systematic reviews, comprising five Cochrane systematic reviews (8, 10-13) and three non-Cochrane systematic reviews/meta-analyses (14-16), were identified for this study. Table 4 describes the selected meta-analyses. Of the eight meta-analyses, the majority (87.5%, 7/8) reported NNT in the results, although only four studies (50%) described NNT in the methods section. In detail, 80% (4/5) of Cochrane reviews and 66.7% (2/3) of non-Cochrane reviews reported NNT in the results. Only two studies (25%, 2/8) reported susceptibility to publication bias, provided a simplified interpretation, or discussed NNT.
Table 4. Characteristics of the selected studies
|Cochrane||(8) (oral etoricoxib)||yes||yes||yes||yes||yes|
|Cochrane||(10) (oral varenicline)||no||yes||yes||no||yes|
|Cochrane||(11) (UFH, LMWH)||yes||yes||no||no||no|
|Non-Cochrane||(15) (prophylactic antibiotics)||no||yes||no||yes||yes|
|Non-Cochrane||(16) (lidocaine/tetracaine patch)||yes||yes||yes||unclear||no|
Although the Cochrane Handbook for Systematic Reviews of Interventions suggests that reviewers compute NNT when reporting effect estimates, there is still room for improvement. On the other hand, we noted a merit of two reviews identified for this study (12, 16): they reported NNT in their abstracts. Information in the abstract reaches a greater number of consumers as well as stakeholders. In the context of evidence-based medicine, with NNT increasingly used in comparative effectiveness studies of therapies, its accurate estimation and correct interpretation are crucial (17). In the current analysis, the Cochrane review of Cahill (10) highlighted the important factors in the computation of NNT: “The number needed to treat to benefit to achieve each additional successful quitter can be derived from the pooled difference between placebo and treatment quit rates. However, absolute quit rates vary considerably between trials, according to the definition of cessation, length of follow-up, the population treated, and the extent of the counseling and follow-up support given.”
Of note, if a meta-analysis did not compute NNTs, the reason might be varying follow-up times. Any meta-analysis that computes NNTs must base them on fixed follow-up times. This is an important precaution when summarizing data in meta-analyses that include individual trials with unequal follow-up times. In pharmacological intervention studies, drug effects can accumulate over time, so the size of the effect is likely to change with the length of follow-up. Studies have highlighted that the computation of NNTs can be inaccurate, and their interpretation misleading, in trials with varying follow-up times (17, 18). Although the NNT is an intuitively appealing measure of the effect of a treatment, it must be computed with care in such trials (18).
Regarding the reporting of NNT, not all meta-analyses included in this study reported the corresponding 95% CI of the NNT. As a general principle, the CI describes the uncertainty inherent in an estimate, and gives the range within which we can be reasonably sure that the true effect actually lies (3). Under a fixed-effect model, the 95% CI addresses the question, “if there is a single population treatment effect across all the trials, what is the best estimate (and uncertainty) for this common treatment effect?” (19). Under a random-effects model, it addresses the question, “what is the best estimate of the average effect?”
Regarding the interpretation of NNT, not all studies provided a simplified interpretation. For example, an NNT of 9 can be interpreted as “it is expected that one additional person will incur an event for every nine participants.” NNT = 9 does not imply that one additional event will occur in each and every group of nine people. In essence, a large treatment effect on the absolute scale leads to a small NNT (2). A treatment that is expected to save one life for every nine patients treated is better than a competing treatment that saves one life for every 68 treated. In general, where efficacy is high, few patients need to be treated; where efficacy is low, many patients need to be treated. With regard to unwanted effects, the NNH should be as large as possible when assessing the safety of the intervention.
When assessing the strength of evidence for each outcome, the total number of participants contributing data, the methodological quality of the studies, and the degree of heterogeneity are crucially important criteria. Also to be considered is the number of additional participants in studies with zero effect (relative benefit of one) that would be required to change the result (7), based on an arbitrary value of NNT. None of the meta-analyses reported susceptibility to publication bias based on the clinical utility of NNT, except the Clarke Cochrane review (8). The NNT of predetermined utility (an arbitrary NNT, based on expert opinion in some cases) was typically taken to be 8 or greater in meta-analyses of pain-related conditions (7); we assumed 10 for this study. The Clarke review (8) described its assessment as follows: “We carried out a comprehensive search for relevant studies, and investigated the potential influence of publication bias by examining the number of participants in studies with zero effect (relative risk of 1.0) needed for the point estimate of the NNT to increase beyond a clinically useful level. In this case, we chose a clinically useful level as eight. For the primary outcome of at least 50% pain relief with etoricoxib 120 mg, more than 5000 participants (ie, 5217, see Table 3) would have to have been involved in unpublished trials with zero treatment effects for the NNT to increase above this threshold. It is highly unlikely that this amount of missing data exists.” In the context of knowledge translation, such understandable information is valuable for knowledge consumers, who include point-of-care clinicians.
Overall, the findings of this study suggest that (1) many, if not all, meta-analyses report NNT; (2) even when reported, information on NNT was inadequate; and (3) even when reported adequately, NNT was not adequately discussed. We acknowledge the caveats of our study. Given the cross-sectional design, generalizability is limited, and the study is likely to have missed meta-analyses that report NNT properly. Further studies with larger samples are recommended.
Our findings serve as a timely reminder that NNT should be computed for equal follow-up times and reported with its corresponding 95% CI. The NNT conveys the magnitude of the clinical effect, which is important for point-of-care clinicians and for healthcare decision makers concerned with the best utilization of resources.
We are grateful to the participants and the researchers of the primary studies used for the present analysis. We wish to thank the International Medical University (IMU), Malaysia for allowing us to perform this study.