Footprint of publication selection bias on meta‐analyses in medicine, environmental sciences, psychology, and economics

Publication selection bias undermines the systematic accumulation of evidence. To assess the extent of this problem, we survey over 68,000 meta‐analyses containing over 700,000 effect size estimates from medicine (67,386/597,699), environmental sciences (199/12,707), psychology (605/23,563), and economics (327/91,421). Our results indicate that meta‐analyses in economics are the most severely contaminated by publication selection bias, closely followed by meta‐analyses in environmental sciences and psychology, whereas meta‐analyses in medicine are contaminated the least. After adjusting for publication selection bias, the median probability of the presence of an effect decreased from 99.9% to 29.7% in economics, from 98.9% to 55.7% in psychology, from 99.8% to 70.7% in environmental sciences, and from 38.0% to 29.7% in medicine. The median absolute effect sizes (in terms of standardized mean differences) decreased from d = 0.20 to d = 0.07 in economics, from d = 0.37 to d = 0.26 in psychology, from d = 0.62 to d = 0.43 in environmental sciences, and from d = 0.24 to d = 0.13 in medicine.


Introduction
Publication selection biases (PSB) are defined as the selective reporting of results in ways that deviate from the objective, complete scientific record. PSB may entail the suppression of "negative" findings or the conversion of "negative" results into more "positive" ones (e.g., those with more favorable p-values and/or with larger effect sizes) and might represent a problem in all scientific disciplines, e.g., [1,2,3,4,5]. Studies that examine the self-reported behavior of researchers show that 78% of researchers failed to report all dependent measures of a study [6] (however, see [7] for a response that suggests a lower proportion). Some studies also suggest that PSB might be modestly increasing in some areas, although the exact nature, prevalence, and impact of PSB are unknown and likely to vary across scientific fields [8,9].
To gauge the extent of PSB, one would need to have access to the complete scientific record or a representative and wide-coverage sample of it. However, this is infeasible as much of the relevant data is not publicly recorded. Instead, the footprint of PSB is indirectly probed by re-analyzing meta-analyses in several specific fields with different statistical techniques [9,10,11,12,13,14] and focusing on patterns in the published results that would herald the presence of PSB. All these available methods try to identify the footprint of PSB, and thus their results need to be interpreted with caution since these patterns (e.g., correlations of effect sizes and standard errors) may sometimes be due to factors other than PSB (e.g., genuine heterogeneity across studies). However, when large numbers of meta-analyses show the same patterns, this constitutes a probable footprint of PSB, which can be used to estimate its relative magnitude across different fields.
Previous field-wide assessments of PSB suggested that the prevalence of over-reporting positive results and other possible symptoms of bias increased moving from the physical to the biological and the social sciences, and even suggested that problems might be worsening over time in the latter [9,15,16,17,18,19,20]. However, these estimates were based on proxy measures of PSB that have several limitations.
To our knowledge, no previous survey of the potential footprint of PSB has used state-of-the-art methods. Our proposed approach is more comprehensive than past surveys: it employs different strategies to identify potential PSB, uses new measures of PSB, and analyzes a much larger number of research studies covering the fields of medicine, environmental sciences, psychology, and economics.

Data Sets
We used five large data sets from medicine, environmental sciences, psychology, and economics. The data set from medicine comprises meta-analyses of continuous and dichotomous outcomes obtained from the Cochrane Database of Systematic Reviews published between 1997 and 2020. The data set from environmental sciences comprises the meta-analyses of mean differences, odds ratios, and correlation coefficients by Deressa and colleagues [21] published between 2010 and 2020. The data sets from psychology comprise the meta-analyses of mean differences and correlation coefficients by Stanley and colleagues [12] published between 2011 and 2016 combined with a random sample of meta-analyses published in psychological journals by Sladekova and colleagues [22] published in 2008 and 2018. Finally, the data set from economics comprises the extended data set of meta-analyses of regression and correlation coefficients by Ioannidis and colleagues [10] published between 1967 and 2021. Eighty-four meta-analyses were part of both the [10] and [12] data sets. Since each of these meta-analyses could be classified in both fields (psychology and economics), we did not remove them from either of the data sets. From each data set, we only used meta-analyses with at least three estimates reported using standardized effect size metrics such as log odds ratios, standardized mean differences, and (partial) correlation coefficients that can be transformed to a common standardized mean difference effect size metric, Cohen's d.

Medicine
The data set from medicine comprises meta-analyses of continuous and dichotomous outcomes obtained from the Cochrane Database of Systematic Reviews (CDSR) published between 1997 and 2020. We identified systematic reviews in the CDSR through PubMed, limiting the period to January 2000 to May 2020. For that, we used the NCBI's EUtils API with the following query: "Cochrane Database Syst Rev"[journal] AND ("2000/01/01"[PDAT]: "2020/05/31"[PDAT]). For each review, we downloaded the XML meta-analysis table file (rm5-format) associated with the review's latest version. We extracted the tables with continuous and dichotomous outcomes from these rm5-files with custom JavaScript and R programs (https://github.com/wmotte/cochrane2022).
We selected meta-analysis tables based on the highest aggregation reported in the CDSR. For each meta-analysis, we removed estimates based on one or fewer participants in the control or treatment group and used all meta-analyses with at least three effect size estimates.
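The PubMed search described above could be issued roughly as follows. This is a minimal sketch, not the authors' pipeline (which used custom JavaScript and R programs): it only builds the request URL for the public ESearch endpoint of NCBI E-utilities. The function name and the paging values are hypothetical.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of NCBI E-utilities.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string as quoted in the text.
CDSR_QUERY = (
    '"Cochrane Database Syst Rev"[journal] AND '
    '("2000/01/01"[PDAT]: "2020/05/31"[PDAT])'
)


def build_esearch_url(term: str, retstart: int = 0, retmax: int = 500) -> str:
    """Return an ESearch URL that requests one page of PubMed IDs for `term`.

    `retstart` and `retmax` control paging; the defaults are illustrative.
    """
    params = {
        "db": "pubmed",       # search the PubMed database
        "term": term,         # the search query
        "retstart": retstart, # index of the first record to return
        "retmax": retmax,     # number of records per page
        "retmode": "json",    # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url(CDSR_QUERY))
```

Fetching the URL and paging through `retstart` would return the PubMed IDs of the matching reviews. Downloading the rm5 files themselves is a separate step that is not sketched here.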

Environmental Sciences
The environmental sciences data set consists of meta-analyses of mean differences, correlation coefficients, and odds ratios published between 2010 and 2020. The literature search was performed in the Scopus database using the query: "TITLE-ABS-KEY ("meta analy*" OR "meta-analy*" OR "metaanaly*" OR "meta reg*" OR "meta-reg*" OR "metareg*") AND SUBJAREA (envi)" on July 21, 2020. Detailed information about the sampling strategy and inclusion/exclusion criteria used can be found in [21].

Psychology
The data set from psychology comprises the data set of meta-analyses of mean differences and correlation coefficients of Stanley and colleagues [12] published between 2011 and 2016 combined with data from Sladekova and colleagues [22], a random sample of 433 meta-analyses from 90 articles published in 2008 and 2018. See [12] and [22] for more details about the collected data sets. None of the meta-analyses by [22] were published in Psychological Bulletin, precluding overlap with the data set of Stanley and colleagues [12].

Economics
The data set from economics comprises the extended data set of meta-analyses of regression and correlation coefficients of Ioannidis and colleagues [10] published between 1967 and 2021. The meta-analyses were identified using various search engines (e.g., Econlit and Scopus), publisher sites (e.g., Science Direct, Sage, and Wiley), webpages of researchers known to publish meta-analyses, and by searching all volumes of individual journals that are known to publish meta-analyses. We also emailed 109 research teams (associated with either sole authored or co-authored meta-analyses) for data, with a 67% response rate. The search for data ended on May 30th, 2021.
We selected meta-analyses of standardized mean differences, (partial) correlation coefficients, and mean differences (if enough information was available to compute the standardized mean differences).

Effect Size Calculation
In cases where the data set did not already feature standardized effect sizes (Cohen's d, correlation coefficient r, log(OR), or Fisher's z), we used the metafor R package [23] to calculate the standardized effect sizes. For dichotomous outcomes with zero cell counts, we used the default empty cell correction, adding 1/2 to empty cells. Finally, we converted all standardized effect sizes to Fisher's z by using the formulas in [24].
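The conversion chain between effect size metrics can be sketched as follows. This is a minimal illustration that assumes the common textbook conversions (log odds ratio to Cohen's d via the logistic approximation, d to r under equal group sizes, and r to Fisher's z). The text does not confirm that these are exactly the formulas in [24], and all function names are hypothetical.

```python
import math


def log_odds_ratio_to_d(log_or: float) -> float:
    """Convert a log odds ratio to Cohen's d (logistic-distribution approximation)."""
    return log_or * math.sqrt(3) / math.pi


def d_to_r(d: float) -> float:
    """Convert Cohen's d to a correlation, assuming two equally sized groups."""
    return d / math.sqrt(d**2 + 4)


def r_to_fisher_z(r: float) -> float:
    """Fisher's z transformation of a correlation coefficient."""
    return math.atanh(r)


def fisher_z_to_d(z: float) -> float:
    """Back-transform Fisher's z to Cohen's d (inverse of the two steps above)."""
    r = math.tanh(z)
    return 2 * r / math.sqrt(1 - r**2)


if __name__ == "__main__":
    d = log_odds_ratio_to_d(0.8)
    z = r_to_fisher_z(d_to_r(d))
    print(f"d = {d:.3f}, Fisher's z = {z:.3f}, back to d = {fisher_z_to_d(z):.3f}")
```

The standard errors need to be transformed along with the point estimates. That step is omitted here for brevity.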

Publication Bias Adjustment with Bayesian Model-Averaging
We used the PSB detection and correction technique RoBMA-PSMA [25,26]. RoBMA employs Bayesian model-averaging [27,28] and combines the best of two well-performing publication bias adjustment methods: selection models with six different weight functions that adjust for publication selection across a combination of statistical significance and direction of the effect [29], and PET-PEESE, which adjusts for the relationship between effect sizes and standard errors or standard errors squared [30]. Bayesian model-averaging allows us to combine these publication bias adjustment methods based on their predictive adequacy, such that models that predict well have a larger impact on the inference. In that way, we can evaluate the evidence in favor of or against the hypothesis of PSB and its impact without committing to any single estimation or correction method [27]. We used the default RoBMA parameterization, which was shown to achieve better performance in both simulation studies and real data examples than either of the publication bias adjustment methods alone [26]. It gives equal prior model probabilities to models assuming the presence vs. absence of an effect, heterogeneity, and publication selection bias. RoBMA employs a standard normal prior distribution on the effect size, $\mu \sim \mathrm{Normal}(0, 1)$, an empirically informed inverse-gamma prior distribution on the heterogeneity, $\tau \sim \text{Inverse-Gamma}(1, 0.15)$ [31], cumulative unit Dirichlet prior distributions on publication probabilities, and Cauchy prior distributions on the PET-PEESE regression coefficients, $\mathrm{PET} \sim \mathrm{Cauchy}_{+}(0, 1)$ and $\mathrm{PEESE} \sim \mathrm{Cauchy}_{+}(0, 5)$.
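The model-averaging step can be illustrated in isolation. The sketch below is a generic implementation of Bayesian model-averaging, not the RoBMA package: it takes log marginal likelihoods (which RoBMA would obtain from its fitted models) and prior model probabilities, and returns posterior model probabilities and a model-averaged estimate. The function names and the example numbers are hypothetical.

```python
import math


def posterior_model_probabilities(log_marglik, prior_probs):
    """Posterior model probabilities from log marginal likelihoods and priors."""
    log_unnorm = [lm + math.log(p) for lm, p in zip(log_marglik, prior_probs)]
    shift = max(log_unnorm)  # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in log_unnorm]
    total = sum(weights)
    return [w / total for w in weights]


def model_average(estimates, post_probs):
    """Average model-specific estimates, weighted by posterior model probability."""
    return sum(e * p for e, p in zip(estimates, post_probs))


if __name__ == "__main__":
    # Two hypothetical models with equal prior probability; the second
    # predicts the data three times better than the first.
    probs = posterior_model_probabilities([0.0, math.log(3)], [0.5, 0.5])
    print(probs)                               # approximately [0.25, 0.75]
    print(model_average([0.0, 0.4], probs))    # approximately 0.3
```

Models that predict the data better receive larger posterior probability and therefore pull the averaged estimate toward their own estimate.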

InfoBox 1: Bayes factors
The Bayes factor is the key inference criterion for much of Bayesian statistics, e.g., [32,33]. It compares the relative predictive accuracy (i.e., likelihood of the data) under competing hypotheses (e.g., H1 vs. H0), and it can also be expressed as the ratio of posterior to prior model odds,
$$\mathrm{BF}_{10} = \frac{p(\text{data} \mid H_1)}{p(\text{data} \mid H_0)} = \frac{p(H_1 \mid \text{data})}{p(H_0 \mid \text{data})} \bigg/ \frac{p(H_1)}{p(H_0)}. \tag{1}$$
Although the Bayes factor is a continuous measure of strength of evidence, the following rules of thumb may aid interpretation: Bayes factors between 1 and 3 are commonly regarded as weak evidence, Bayes factors between 3 and 10 as moderate evidence, and Bayes factors larger than 10 as strong evidence for the alternative (or the hypothesis at the top of Equation 1). When the evidence for the null is considered, the Bayes factor is simply inverted. In other words, a Bayes factor between 1/3 and 1 is considered weak evidence, a Bayes factor between 1/10 and 1/3 moderate evidence, and a Bayes factor smaller than 1/10 strong evidence for the null (e.g., [34]; [35]).
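The relation between Bayes factors, prior odds, and posterior probabilities, together with the rules of thumb above, can be written as a short sketch. The function names are hypothetical, and the treatment of values falling exactly on a threshold (3 or 10) is an arbitrary choice made here, since the rules of thumb do not specify it.

```python
def posterior_probability(bf10: float, prior_prob: float = 0.5) -> float:
    """Posterior probability of H1 given a Bayes factor and a prior probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = bf10 * prior_odds
    return posterior_odds / (1 + posterior_odds)


def evidence_label(bf10: float):
    """Return (strength, favored hypothesis) following the rules of thumb."""
    favored = "H1" if bf10 >= 1 else "H0"
    bf = bf10 if bf10 >= 1 else 1 / bf10  # invert when the null is favored
    if bf > 10:
        strength = "strong"
    elif bf > 3:
        strength = "moderate"
    else:
        strength = "weak"
    return strength, favored


if __name__ == "__main__":
    # With equal prior probabilities, a Bayes factor of 7.27 corresponds to
    # a posterior probability of about 0.879.
    print(round(posterior_probability(7.27), 3))
    print(evidence_label(7.27))
    print(evidence_label(0.05))
```

With equal prior probabilities the posterior odds equal the Bayes factor, which is why the two quantities can share one plot axis.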

Measures
For each meta-analysis, we used RoBMA to calculate the PSB-adjusted posterior model-averaged effect size assuming it is present (i.e., without averaging over the point null models to reduce shrinkage toward zero), $\mu_{\mathrm{adj},k}$; the PSB-adjusted posterior probability of the presence of the effect, $p_{\mathrm{adj},k}(H_1 \mid \text{data}_k)$; and the posterior probability of the presence of PSB, $p_{\mathrm{adj},k}(H_{\mathrm{PSB}} \mid \text{data}_k)$. To isolate the effect of PSB adjustment, we compare the Bayesian, PSB unadjusted, model-averaged meta-analysis by dropping the PSB adjustment and thereby estimating the unadjusted posterior probability of the presence of the effect assuming it is present, $p_{\mathrm{unadj},k}(H_1 \mid \text{data}_k)$. We use $k = 1, \ldots, K$ to denote the individual meta-analyses. Each meta-analysis is based on $N_k$ estimates that are characterized by the effect size $y_{k,n}$ and standard error $se_{k,n}$.

Evidence for the Effect
We used the change in the posterior probability of the effect and the (standardized) evidence inflation factor to quantify the effect of PSB on meta-analytic evidence.
The posterior probability of the effect is an intuitive way of quantifying the evidence in favor of the alternative hypothesis of the presence of an effect. Under the assumption of equal prior probability of the presence and the absence of the effect, $p(H_1) = p(H_0)$, posterior probabilities larger than 0.5 indicate that the data are more likely under the presence of the effect. On the other hand, posterior probabilities lower than 0.5 indicate that the data are more likely under the absence of the effect. The ability to quantify evidence for both the null and the alternative is a key benefit of Bayesian methods over null hypothesis significance testing [36,37].
A corresponding way of quantifying the evidence of an effect is via Bayes factors (see InfoBox 1 for more detail). Bayes factors quantify the change from prior to posterior odds for the presence of the effect. The advantage of Bayes factors is that they are independent of the prior odds for the presence of the effect. In other words, Bayes factors isolate the evidence for the presence of the effect contained in the data. In our setting, the assumption of equal prior probabilities leads to an equivalence between Bayes factors and posterior odds.
The change from the PSB unadjusted posterior probability of the effect, $p_{\mathrm{unadj},k}(H_1 \mid \text{data}_k)$, to the PSB adjusted posterior probability of the effect, $p_{\mathrm{adj},k}(H_1 \mid \text{data}_k)$, quantifies the amount of evidence introduced by PSB. The larger the impact of PSB, the larger the difference between the PSB unadjusted and PSB adjusted posterior probabilities of the effect. If there was no PSB, we would observe no change in the posterior probability of the effect after PSB adjustment.
Evidence inflation factor (EIF) quantifies the degree to which the evidence in favor of the presence of the effect was inflated due to PSB. $\mathrm{EIF}_k$ compares the amount of evidence in favor of the effect in the PSB unadjusted meta-analysis, $\mathrm{BF}_{10,\mathrm{unadj},k}$, to the amount of evidence in favor of the effect in the PSB adjusted meta-analysis, $\mathrm{BF}_{10,\mathrm{adj},k}$,
$$\mathrm{EIF}_k = \frac{\mathrm{BF}_{10,\mathrm{unadj},k}}{\mathrm{BF}_{10,\mathrm{adj},k}}.$$
An evidence inflation factor larger than one indicates inflated evidence in favor of the effect due to PSB.
However, the amount of evidence contained in each meta-analysis, and the corresponding evidence inflation, is dependent on the number of meta-analyzed estimates, $N_k$, i.e., more estimates lead to more evidence. To facilitate the comparison of evidence inflation due to PSB in meta-analyses with different numbers of estimates, we also compute the standardized, per-estimate, evidence inflation factor in each meta-analysis, $\mathrm{sEIF}_k$, by standardizing the EIF by the number of estimates,
$$\mathrm{sEIF}_k = \mathrm{EIF}_k^{1/N_k},$$
where sEIF represents each estimate's marginal contribution, on average, to the evidence inflation due to PSB. The sEIF also partially mitigates the potential issue of dependent estimates within a meta-analysis. In the most extreme case, e.g., identical estimates, the same data is conditioned upon multiple times, which leads to overestimation of evidence. Taking only a fraction of each estimate's likelihood, proportional to the number of estimates, then ensures that the data are not conditioned upon more than once, although data producing multiple estimates are still weighted more heavily.
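The two inflation measures can be computed as in the sketch below. It assumes that the EIF is the ratio of the unadjusted to the adjusted Bayes factor and that the per-estimate standardization takes the $N_k$-th root of the EIF (a geometric, multiplicative standardization); the latter is inferred from the description of "taking only a fraction of each estimate's likelihood" and should be read as an assumption. Function names and example values are hypothetical.

```python
import math


def evidence_inflation_factor(bf10_unadj: float, bf10_adj: float) -> float:
    """EIF: unadjusted Bayes factor divided by the PSB-adjusted Bayes factor."""
    return bf10_unadj / bf10_adj


def standardized_eif(eif: float, n_estimates: int) -> float:
    """sEIF: per-estimate inflation, assumed here to be the N-th root of the EIF."""
    return math.exp(math.log(eif) / n_estimates)


if __name__ == "__main__":
    # Hypothetical meta-analysis with 40 estimates.
    eif = evidence_inflation_factor(bf10_unadj=5000.0, bf10_adj=2.5)
    print(eif)                        # 2000.0
    print(standardized_eif(eif, 40))  # per-estimate factor, slightly above 1
```

Because Bayes factors multiply across independent pieces of evidence, a per-estimate factor only slightly above one can compound into a very large EIF when a meta-analysis contains many estimates.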

Effect Size Estimates
Absolute bias (bias) quantifies the degree to which the average effect size in each meta-analysis overestimates the PSB-adjusted meta-analytic effect size estimate assuming the presence of the effect, $\mu_{\mathrm{adj},k}$. Absolute bias larger than zero indicates that PSB leads to inflated effect size estimates. We compare the average effect sizes to the PSB-adjusted effect sizes assuming the presence of the effect (conditional effect size estimates) rather than averaging across all models. Excluding models assuming the absence of a mean effect mitigates the pooling towards 0 in meta-analyses more consistent with the null hypothesis. Tables 5 and 6 in the Supplementary Materials use the PSB-adjusted effect sizes model-averaged across all models, including models assuming the absence of a mean effect. These estimates, which are model-averaged also over the null, indicate stronger absolute bias compared to the conditional estimates presented in the main manuscript (Table 2).
Overestimation factor (OF) quantifies the degree to which the average effect sizes in meta-analyses overestimate the PSB-adjusted effect size estimates assuming the presence of the effect. An overestimation factor larger than one is evidence of PSB. We use the delta method to obtain the confidence interval of the overestimation factor. In the Supplementary Materials, we also report medians and interquartile ranges of per meta-analysis overestimation factors, $\mathrm{OF}_k$. However, note that $\mathrm{OF}_k$ can lead to non-sensible results, as a meta-analysis with a positive mean effect and a very small negative PSB-adjusted effect size estimate results in an extremely large negative $\mathrm{OF}_k$.
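The effect size measures can be illustrated with a short sketch. The exact formulas are not recoverable from the text, so the forms below are assumptions: absolute bias is taken as the absolute mean effect size minus the absolute PSB-adjusted estimate, the field-level OF as the ratio of summed absolute mean effects to summed absolute adjusted effects, and the per meta-analysis $\mathrm{OF}_k$ as the signed ratio (which reproduces the non-sensible negative values described above). All names and numbers are hypothetical.

```python
from statistics import mean


def absolute_bias(estimates, mu_adj: float) -> float:
    """bias_k (assumed form): |mean effect size| minus |PSB-adjusted effect size|."""
    return abs(mean(estimates)) - abs(mu_adj)


def overestimation_factor(mean_effects, adjusted_effects) -> float:
    """Field-level OF (assumed form): ratio of summed absolute effects."""
    numerator = sum(abs(m) for m in mean_effects)
    denominator = sum(abs(a) for a in adjusted_effects)
    return numerator / denominator


def per_meta_analysis_of(mean_effect: float, mu_adj: float) -> float:
    """OF_k (assumed form): signed ratio of the mean effect to the adjusted effect."""
    return mean_effect / mu_adj


if __name__ == "__main__":
    print(absolute_bias([0.2, 0.4, 0.6], mu_adj=0.25))        # about 0.15
    print(overestimation_factor([0.4, 0.2], [0.2, 0.1]))      # about 2.0
    # A positive mean effect with a tiny negative adjusted estimate gives
    # an extremely large negative per meta-analysis factor.
    print(per_meta_analysis_of(0.3, -0.001))                  # about -300
```

Aggregating before taking the ratio keeps the field-level factor stable even when individual adjusted estimates are close to zero, which is one reason to prefer it over summaries of the per meta-analysis ratios.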

Evidence for Publication Selection Bias
Posterior probability of PSB is an intuitive way of quantifying the evidence in favor of PSB. Similarly to the posterior probability of the presence of the effect, under the assumption of equal prior probability of the presence and the absence of PSB, $p(H_{\mathrm{PSB}}) = p(H_{\mathrm{NoPSB}})$, a posterior probability larger than 0.5 indicates that the data are more likely under the presence of PSB. On the other hand, a posterior probability lower than 0.5 indicates that the data are more likely under the absence of PSB. As before, the Bayes factor for the presence of PSB, $\mathrm{BF}_{\mathrm{PSB}}$, quantifies the change from prior to posterior odds for the presence of PSB. A Bayes factor in favor of the presence of PSB larger than one provides evidence in favor of the presence of PSB, and a Bayes factor lower than one provides evidence against the presence of PSB.
Relative publication probabilities quantify the relative probability of an estimate being published for a given p-value interval compared to estimates with statistically significant p-values. We use one-sided p-values, resulting in p-values larger than 0.5 corresponding to estimates in the opposite direction. To facilitate the interpretation, we visualize a weight function that shows the change of relative publication probabilities across the range of p-values. We report the results only in Supplementary Materials.
Effect size inflation in imprecise estimates quantifies the relationship between the effect sizes and their standard errors. To facilitate the interpretation of the funnel asymmetry test, we visualize the bias in effect sizes as a function of standard errors (incorporating the quadratic term from the RoBMA model). We report the results only in Supplementary Materials.
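A step weight function over one-sided p-values can be sketched as follows. The cutpoints and weights in the example are hypothetical and chosen only for illustration; they are not estimates from this survey, and the sketch does not reproduce the six weight functions used by RoBMA.

```python
import math
from bisect import bisect_left


def one_sided_p_value(effect: float, se: float) -> float:
    """One-sided p-value for a positive effect under a normal approximation.

    Values above 0.5 correspond to estimates in the opposite (negative) direction.
    """
    z = effect / se
    return 0.5 * math.erfc(z / math.sqrt(2))


def relative_publication_probability(p: float, cutpoints, weights) -> float:
    """Step weight function: weights[i] applies to p-values up to cutpoints[i].

    `weights` has one more element than `cutpoints`; the last weight covers
    p-values above the final cutpoint. The first weight is the reference (1.0).
    """
    return weights[bisect_left(cutpoints, p)]


if __name__ == "__main__":
    cutpoints = [0.025, 0.05, 0.5]   # hypothetical one-sided cutpoints
    weights = [1.0, 0.7, 0.4, 0.1]   # hypothetical relative probabilities
    for effect, se in [(0.5, 0.1), (0.18, 0.1), (0.05, 0.1), (-0.2, 0.1)]:
        p = one_sided_p_value(effect, se)
        w = relative_publication_probability(p, cutpoints, weights)
        print(f"effect={effect:+.2f}, p={p:.3f}, relative probability={w}")
```

Reading the weights from left to right shows how much less likely an estimate is to be published as its p-value moves away from statistical significance and eventually into the opposite direction.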
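The relationship between effect sizes and standard errors can be illustrated with the classical, frequentist PET and PEESE regressions: weighted least squares of the effect sizes on the standard errors (PET) or on the squared standard errors (PEESE), with inverse-variance weights. This sketch is not the Bayesian PET-PEESE component of RoBMA and omits its priors and heterogeneity; the function names and data are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pet(effects, standard_errors):
    """PET: regress effects on standard errors; the intercept is the adjusted effect."""
    w = [1 / se**2 for se in standard_errors]
    return weighted_linear_fit(standard_errors, effects, w)


def peese(effects, standard_errors):
    """PEESE: regress effects on squared standard errors."""
    w = [1 / se**2 for se in standard_errors]
    return weighted_linear_fit([se**2 for se in standard_errors], effects, w)


if __name__ == "__main__":
    ses = [0.05, 0.10, 0.20, 0.40]
    # Hypothetical estimates whose size grows with their standard error.
    effects = [0.1 + 2.0 * se for se in ses]
    intercept, slope = pet(effects, ses)
    print(f"PET intercept={intercept:.3f}, slope={slope:.3f}")
```

A positive slope means that less precise estimates report larger effects, and the intercept extrapolates to a hypothetical estimate with zero standard error.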

Descriptives
Table 1: Summary of the data sets from each field. The number of estimates per meta-analysis (Estimates/MA) and the unweighted simple mean effect size of estimates within each meta-analysis (Effect Sizes) are reported as medians with the interquartile range (in parentheses). The proportion of the statistically significant (Prop. Significant) meta-analytic effect size estimates is based on a random-effect meta-analysis estimated via restricted maximum likelihood with α = 0.05 (removing one environmental sciences and 275 medical meta-analyses that did not converge).
Table 1 compares the characteristics of the meta-analyses from each field. Medical meta-analyses contain the smallest number of estimates per meta-analysis, followed by psychology and environmental sciences with five to six times the number of estimates compared to medicine. Finally, economics meta-analyses contain over twelve times the median number of estimates compared to medicine. Contrary to a naive expectation that more estimates may be conducted to establish smaller effects, economic and medical effect sizes are approximately the same magnitude (measured as a mean effect size per meta-analysis). Effect sizes in psychology are roughly twice as large as those in economics, and effect sizes in the environmental sciences are approximately three times larger than those in economics. Differences in the number of estimates per meta-analysis are most closely reflected in the proportion of statistically significant random-effects estimates. Notably, random-effects estimates in economics, psychology, and environmental sciences are statistically significant approximately twice as often as in medicine. This disparity in the proportion of statistically significant meta-analyses is consistent when comparing meta-analyses with a matched number of estimates across the disciplines, although the difference in mean effects is somewhat smaller (see Table 1 in the Supplementary Materials).
We summarize results from all meta-analyses, apart from seven medical meta-analyses that did not converge. See Supplementary Materials for analyses showing that matching meta-analyses based on the number of primary estimates within each meta-analysis does not meaningfully affect the conclusions.

Evidence for the Effect
Figure 1 shows medians, interquartile ranges, and distributions of the posterior probability of an effect before and after adjusting for PSB. These distributions reveal several patterns. First, meta-analyses in economics, psychology, and environmental sciences predominantly show evidence for an effect before adjusting for PSB (unadjusted), whereas meta-analyses in medicine often display evidence against an effect. This disparity between the fields remains even when comparing meta-analyses with equal numbers of effect size estimates (see Supplementary Materials). After correcting for PSB, the posterior probability of an effect drops much more in economics, psychology, and environmental sciences (medians drop from 99.9% to 29.7%, from 98.9% to 55.7%, and from 99.8% to 70.7%, respectively) compared to medicine (38.0% to 29.7%). The pattern is especially striking in economics, where the median posterior probability of an effect drops by more than seventy percentage points after PSB correction. Mean decreases in posterior probabilities show a similar pattern but with somewhat smaller reductions (Tables 9 and 10 in Supplementary Materials). In all four disciplines, adjusting for PSB resulted in a substantial decrease in the strength of evidence for the effect: the proportion of meta-analyses with at least strong evidence for the presence of an effect (i.e., BF 10 > 10) decreased from 20.2% to 5.3% in medicine, from 72.4% to 30.7% in environmental sciences, from 59.8% to 27.3% in psychology, and from 72.8% to 19.6% in economics. A comparable decrease was also present when comparing the proportion of meta-analyses with at least moderate evidence for the presence of an effect (i.e., BF 10 > 3; from 28.9% to 12.3% in medicine, from 80.4% to 47.3% in environmental sciences, from 67.4% to 38.39% in psychology, and from 76.8% to 27.6% in economics).
Furthermore, we quantify the inflation of evidence in favor of an effect in meta-analyses via the evidence inflation factor, i.e., the increase in the Bayes factor in favor of the effect due to PSB. We find that meta-analyses in economics inflate the evidence by a median factor of 11,369, whereas the meta-analyses in environmental sciences and psychology inflate the evidence by 'only' 45.9 and 30.0, respectively, and medicine by a median factor of 1.33. These extreme differences between the fields are largely driven by the disparity in the typical numbers of estimates per meta-analysis across the disciplines (Table 1). After standardizing the evidence inflation factor (sEIF) by the number of estimates per meta-analysis, we find that per estimate evidence inflation is the largest in psychology with a median factor of 1.27, followed by environmental sciences with a median factor of 1.22, economics with a median factor of 1.15, and medicine with a median factor of 1.05. Again, the strong evidence inflation in economics (11,369) is largely due to having many more estimates per meta-analysis than psychology and medicine. However, even after adjusting for the different numbers of estimates, meta-analyses in medicine still show the least inflated evidence due to PSB.
Note (Figure 1). The width of the grey area indicates density, the light grey area indicates the interquartile range, and the black line indicates the median. The y-axis is scaled according to posterior probabilities assuming equal prior probabilities of presence vs. absence of the effect. See the secondary y-axis for Bayes factors in favor of the effect that are independent of the assumed prior probability of the effect.

Effect Size Estimates
Note (Table 2). The results are based on the comparison of publication bias adjusted meta-analytic effect size estimates assuming presence of the effect to the mean effect sizes per meta-analysis. The table displays means and 95% confidence intervals. (See Table 4 in the Supplementary Materials for medians and interquartile ranges.)
Table 2 summarizes the effect of PSB on effect sizes in each field. The first column reveals that environmental sciences, on average, suffer from up to two and a half times as much absolute bias as medicine, economics, or psychology. The degree of absolute bias in environmental sciences is so large that it is comparable to average unadjusted effect sizes in other fields. Otherwise, medicine, psychology, and economics share a comparable degree of absolute bias. The median absolute bias in each field is lower than the mean bias due to the right-skewed distribution of absolute biases (see Table 4 in the Supplementary Materials).
The second column of Table 2 displays the relative impact of PSB on meta-analytic estimates via the overestimation factor. On average, economics meta-analyses are, relatively, the most exaggerated by PSB, inflating effect sizes by over two times; this corroborates a prior survey on power and bias [10]. Effect sizes in environmental sciences and medical meta-analyses show smaller yet notable relative effect size inflation. Finally, effect sizes in psychological meta-analyses are the least inflated, with the average effect size exaggerated by 40%. In each field, the distribution of absolute biases is right-skewed; consequently, the per meta-analysis overestimation factor median is lower than the mean (see Table 4 in the Supplementary Materials). The median overestimation factor is relatively stable or decreasing with the increasing number of effect size estimates per meta-analysis, suggesting that the number of estimates does not play a role in the relative size of PSB.
Figure 2 shows medians, interquartile ranges, and distributions of the posterior probability of PSB in each field. We find the most evidence for PSB in economics, where the typical evidence of the presence of publication bias is moderate, median $\mathrm{BF}_{\mathrm{PSB}}$ = 7.27, corresponding to an 87.9% posterior probability of publication selection bias. Meta-analyses in environmental sciences and psychology have weak evidence in favor of PSB, even though the proportion of meta-analyses showing at least moderate evidence for PSB is still considerable (32.2% and 27.4%, respectively). Meta-analyses in medicine show the lowest proportion of at least moderate evidence in favor of PSB (12.9%). However, the proportion of meta-analyses consistent with the evidence of absence of PSB (i.e., $\mathrm{BF}_{\mathrm{PSB}}$ < 1/3) is also the lowest in medicine (2.6%), indicating that the majority of medical meta-analyses are not informative enough to provide compelling evidence for or against publication bias. The proportion of meta-analyses with at least moderate evidence against PSB is somewhat higher in psychology (7.1%), economics (12.2%), and environmental sciences (12.6%). Meta-analyses with a larger number of effect size estimates present slightly more evidence in favor of PSB; however, the overall disparity between the fields remains when comparing meta-analyses with a matched number of effect size estimates (Table 17 in Supplementary Materials).

Concluding Comments
We present a comprehensive assessment of publication selection bias and its effects on meta-analyses across medicine, environmental sciences, psychology, and economics. Novel methods and measures allowed us to quantify the evidence for the absence or presence of the mean effect and publication selection bias, as well as inflation of the evidence of the effect due to the publication selection bias. Furthermore, we estimated the bias and overestimation factor of the effect sizes of average estimates included in meta-analyses.
Our analysis is based on all effect size estimates found in these meta-analyses, regardless of the type of outcome or how they were analyzed. One can classify outcomes into three categories. First, some outcomes may have been pre-specified as being of primary interest to show a desirable effect (e.g., the effectiveness of a medication in reducing the risk of death). Second, some other outcomes are not pre-specified but may still be used to demonstrate some preferred outcome; thus, they may have larger analytical flexibility (e.g., using alternative measures of effectiveness) and thereby are potentially more affected by publication selection bias. Third, still other outcomes may have been collected and analyzed without any strong interest to show some significant result, or even with some incentive to show non-significant results (e.g., outcomes on collected adverse events). Publication selection bias is expected mostly in the second category, while it may be less pronounced in the first category [38] and may be entirely absent in the third category.
Furthermore, we assumed independence of the reported primary estimates within and between meta-analyses; that is, each reported estimate is regarded to provide the same amount of new information as every other reported estimate. However, estimates may be dependent between meta-analyses, e.g., a single estimate might be used across multiple meta-analyses, and within meta-analyses, e.g., multiple estimates obtained from a single study/primary data set. As our data does not allow us to tackle those dependencies directly, we discuss how each independence violation might affect the results. The between meta-analysis dependency of estimates is of lesser importance as our inferences are concerned with the population of meta-analyses. Consequently, between meta-analysis dependency of estimates would only affect descriptive summaries of the estimates themselves. The within meta-analysis dependency of estimates is more problematic and can lead to 1) the overestimation of the strength of evidence, as the same primary data set is conditioned upon more than once, and 2) placing more weight on studies with multiple estimates. The first issue is partially mitigated via the standardized evidence inflation factor, which assesses the average evidence contribution of an estimate, i.e., adjusting for the number of data sets conditioned upon. However, the absolute measures of evidence (i.e., evidence for the presence of the effect before and after publication bias adjustment or the evidence for publication bias) can be susceptible to overestimation, particularly in fields with relatively large within meta-analysis dependencies such as economics or environmental science (but see Supplementary Materials for comparison of meta-analyses with a matched number of effect size estimates). The second issue cannot be directly addressed; however, all presented measures are based on comparisons of two sets of models, both of which should be affected to a similar degree, thus hopefully canceling most of the bias that is generated by overweighting studies with multiple estimates. Overall, we cannot exclude that the observed between-field differences may at least partially result from systematic differences in how meta-analyses themselves are conducted.
The milder publication selection bias in medical meta-analyses corroborates previous findings and might have multiple concurring explanations [9,16,19]. First, as in other disciplines, a large share of those medical meta-analyses with seemingly strong evidence no longer had strong evidence when PSB adjustment was made. However, a much lower proportion of medical meta-analyses showed strong evidence of an effect compared to the other disciplines. Therefore, the difference between medicine and the other disciplines might be explained by the higher proportion of meta-analyses in medicine that showed weak evidence for an effect already before adjusting for publication selection bias. Second, medical studies may measure phenomena that are simpler and more stable, using methods that are more solidly and universally codified, which reduces researchers' "degrees of freedom" in generating and publishing evidence [9,16]. Third, it is also possible that the milder publication selection bias seen in medical meta-analyses reflects a larger share of meta-analyses that belonged to a category of outcomes with less pressure for publication selection bias. Finally, medical research makes wider use of research integrity practices, such as clinical trial registration, which might reduce the risk of publication selection bias [39]. Perhaps medical research is therefore typically of higher methodological quality and less subject to bias [9].
In this paper, we documented the considerable impact of publication selection bias on meta-analyses in a variety of disciplines. Even though we can probe the footprint of these biases with the statistical techniques employed here, science ultimately needs to progress toward mitigating publication bias at the stage of conducting and publishing the research. While the specific patterns of researchers' "degrees of freedom" and causes of publication selection bias are likely to vary widely across fields, our results suggest that the social sciences might especially benefit from adopting practices to mitigate these, including preregistration, greater transparency, and registered reports [40,41,42].

Highlights
-Publication selection bias, where studies with significant or positive results are more likely to be reported and published, distorts the available scientific record.
-This study surveyed over 68,000 meta-analyses from medicine, environmental sciences, psychology, and economics to assess the extent of publication selection bias. As a result, it underscores the importance of addressing publication bias in evidence synthesis.
-Results suggest that meta-analyses in economics are the most affected by publication selection bias, followed by environmental sciences and psychology. In contrast, meta-analyses in medicine are suggested to be the least affected. Yet, notable biases are found across all of these scientific disciplines.
-This study documents the potential extent of publication bias in different fields, which could help researchers and the public better understand the limitations of research and the potential biases of research synthesis.

Additional Results
Table 1: Proportion of statistically significant meta-analytic effect size estimates based on random effects meta-analysis estimated via restricted maximum likelihood for meta-analyses with different numbers of studies.
Table 3: Means and 95% confidence intervals for the evidence inflation factors (EIF) and standardized evidence inflation factors (sEIF) in each field. Two economics and three medical meta-analyses were omitted from the computation as the evidence inflation factor was larger than the numerical precision of R.
Note. The width of the grey area indicates density, the light grey area indicates the interquartile range, and the black line indicates the median. The y-axis is scaled according to posterior probabilities assuming equal prior probabilities of presence vs. absence of the type of publication selection bias. See the secondary y-axis for Bayes factors in favor of the effect that are not dependent on the assumed prior probabilities. (We display only a random sample of 5,000 meta-analyses from medicine to enhance clarity.)
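The relation between the two y-axes mentioned in the note follows from Bayes' rule: posterior odds equal the Bayes factor times the prior odds. The following minimal sketch shows the conversion; the function name is our own and is not taken from the analysis code, and the default prior probability of 0.5 corresponds to the equal prior probabilities assumed in the figures.

```python
import math


def posterior_probability(bayes_factor: float,
                          prior_probability: float = 0.5) -> float:
    """Convert a Bayes factor in favor of a hypothesis to its posterior probability."""
    if bayes_factor < 0:
        raise ValueError("bayes_factor must be non-negative")
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must lie strictly between 0 and 1")
    if math.isinf(bayes_factor):
        return 1.0  # overwhelming evidence maps to the top of the axis
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# With equal prior probabilities, a Bayes factor of 10 gives 10 / 11.
print(round(posterior_probability(10.0), 3))
```

A different prior probability shifts the posterior probability but leaves the Bayes factor unchanged, which is why the secondary axis does not depend on the assumed prior.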

Figure 1 presents results regarding the mode in which PSB operates. Most meta-analyses provide slightly more evidence in favor of selection models than in favor of effect size inflation due to small studies, with the posterior probability of selection models assuming the presence of PSB in medicine (median/mean = 62.9/59.8%), economics (median/mean = 67.9/56.9%), psychology (median/mean = 60.5/59.7%), and environmental sciences (median/mean = 66.8/62.3%).
Note. Full lines correspond to medians with the shaded areas depicting interquartile ranges.
Figure 2 shows how PSB affects the probability of the individual studies being published based on their p-values (A) and the expected effect size inflation in small studies (B). Visualization of the median and interquartile range of mean relative publication probabilities in each field (i.e., the relative probability of non-significant p-values being published in comparison to significant p-values) shows comparable median relative publication probabilities for all fields; however, we find lower relative publication probabilities for the lower quartiles in psychology and especially economics. Visualization of the median and interquartile range of effect size inflation as a function of sample size of the primary studies in each field provides slightly different results. We see a slightly steeper increase in bias for very small studies in environmental sciences and psychology in comparison to economics and medicine; however, the upper quartiles of the bias are, again, larger in economics than in psychology, medicine, or environmental sciences.
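To illustrate what a relative publication probability implies for the published record, the toy simulation below draws study estimates around a fixed true effect and publishes every significant result, while non-significant results are published only with a given relative probability. It is a deliberately simplified sketch and not the model-averaged procedure used in this paper: it assumes a single two-sided significance threshold at |z| > 1.96, two equal groups per study, the approximation SE ≈ sqrt(2/n) for a standardized mean difference, and hypothetical parameter values throughout.

```python
import random
import statistics


def mean_published_effect(true_effect: float,
                          n_per_group: int,
                          relative_pub_prob: float,
                          n_studies: int = 20000,
                          seed: int = 1) -> float:
    """Average published estimate under a simple two-step selection rule."""
    rng = random.Random(seed)
    # Approximate standard error of a standardized mean difference
    # with two equal groups (assumption; ignores the d-dependent term).
    se = (2.0 / n_per_group) ** 0.5
    published = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, se)
        significant = abs(estimate / se) > 1.96
        # Significant results are always published; non-significant ones
        # are published with probability relative_pub_prob.
        if significant or rng.random() < relative_pub_prob:
            published.append(estimate)
    return statistics.fmean(published)


unselected = mean_published_effect(0.1, 50, relative_pub_prob=1.0)
selected = mean_published_effect(0.1, 50, relative_pub_prob=0.2)
print(f"no selection: {unselected:.3f}, with selection: {selected:.3f}")
```

With a relative publication probability of 1 the published mean stays close to the true effect, whereas lowering it shifts the published mean upward; the shift is larger for smaller studies, which connects the selection mechanism in panel A to the small-study inflation in panel B.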

Figure 1: Median, Interquartile Range, and Distribution of Posterior Probability for the Presence of the Effect Before and After Adjustment for Publication Selection Bias in Each Field

Figure 2: Median, Interquartile Range, and Distribution of Posterior Probability for the Presence of Publication Selection Bias in Each Field

Figure 1: Distribution of posterior probability for the publication selection bias (A), different types of the publication bias assuming it is present (B), the relative publication probabilities for individual studies (C), and effect size inflation of individual studies as a function of sample size (D) for each field.

Figure 2: The relative publication probabilities for individual studies (A), and effect size inflation of individual studies as a function of sample size (B) for each field.

Table 2: Summary of the footprints of publication selection bias on the meta-analytic effect sizes in the form of absolute bias (in Cohen's d) and overestimation factor.

Table 2: Means and 95% confidence intervals of the posterior probability of an effect before and after adjusting for PSB, and the posterior probability of PSB.

Table 5: Means and 95% CI of absolute bias (in Cohen's d) and overestimation factor model-averaged across models assuming absence of the effect.

Table 6: Medians and interquartile ranges of absolute bias (in Cohen's d) and per meta-analysis overestimation factor model-averaged across models assuming absence of the effect.

Table 7: Median and interquartile range of mean effect size for meta-analyses with different numbers of studies (in Cohen's d).

Table 8: Mean and 95% CI of mean effect size for meta-analyses with different numbers of studies (in Cohen's d).

Table 9: Median and interquartile range of posterior probability of the presence of the effect based on publication bias unadjusted Bayesian model-averaged meta-analysis for meta-analyses with different numbers of studies.

Table 10: Mean and 95% CI of posterior probability of the presence of the effect based on publication bias unadjusted Bayesian model-averaged meta-analysis for meta-analyses with different numbers of studies.

Table 11: Median and interquartile range of standardized evidence inflation for meta-analyses with different numbers of studies.

Table 12: Mean and 95% CI of standardized evidence inflation for meta-analyses with different numbers of studies.

Table 13: Median and interquartile range of bias of mean effect sizes for meta-analyses with different numbers of studies (in Cohen's d).

Table 14: Mean and 95% CI of bias of mean effect sizes for meta-analyses with different numbers of studies (in Cohen's d).

Table 15: Median and interquartile range of overestimation factor of mean effect sizes for meta-analyses with different numbers of studies.

Table 16: Mean and 95% CI of overestimation factor of mean effect sizes for meta-analyses with different numbers of studies.

Table 17: Median and interquartile range of posterior probability of the presence of publication selection bias for meta-analyses with different numbers of studies.

Table 18: Mean and 95% CI of posterior probability of the presence of publication selection bias for meta-analyses with different numbers of studies.