Meta‐analysis and partial correlation coefficients: A matter of weights

This study builds on the simulation framework of a recent paper by Stanley and Doucouliagos (Research Synthesis Methods 2023;14:515–519). S&D use simulations to make the argument that meta‐analyses using partial correlation coefficients (PCCs) should employ a "suboptimal" estimator of the PCC standard error when constructing weights for fixed effect and random effects estimation. We address concerns that their simulations and subsequent recommendation may give meta‐analysts a misleading impression. While the estimator they promote dominates the "correct" formula in their Monte Carlo framework, there are other estimators that perform even better. We conclude that more research is needed before best practice recommendations can be made for meta‐analyses with PCCs.


Highlights
What is already known?
• Inverse-variance weighting using the estimated variance of sample partial correlation coefficients (PCCs) introduces bias.
• Simulations demonstrate that a "suboptimal" estimate of the sampling variance of the PCC estimator produces better results than the "correct" estimate.
• These simulations are used to recommend the "suboptimal" estimator when conducting meta-analyses with PCCs.
What is new?
• We explain why inverse variance weighting that uses a "suboptimal" estimator of the PCC's sampling variance can perform better in meta-analyses than one that uses the "correct" estimator of the sampling variance.
• We show that other estimators can outperform S&D's recommended estimator, even within their own simulation environment.
• There is insufficient evidence to support best practice recommendations for meta-analyses with PCCs.
Potential impact for Research Synthesis Methods readers outside the authors' field
• While the analysis here focuses on PCCs, similar issues arise whenever the variance of the estimated effect is a function of the effect size, as with Cohen's d and log odds ratios.
• Until further research establishes best practice recommendations, meta-analysts using these effects should employ a variety of approaches to determine robustness.

| INTRODUCTION
In a recent paper, Stanley and Doucouliagos,1 henceforth S&D, argue that meta-analysts should never use correct standard errors when performing meta-analyses with partial correlation coefficients (PCCs). They present simulations demonstrating that an alternative, "suboptimal" estimator of the standard error, commonly used in the economics meta-analysis literature, statistically dominates the "correct" estimator under random effects, fixed effect, and unrestricted weighted least squares (UWLS) estimation. They recommend its use when the meta-analysis sample is relatively large and the population value of the PCC is relatively small. In this paper, we confirm S&D's results but argue that their simulations and recommendation may give meta-analysts a misleading impression. We show that S&D's "suboptimal" estimator of the PCC standard error is itself dominated by other estimators. In S&D's simulation environment, OLS produces better results. In more realistic simulation environments, we show that other estimators dominate, and we explain why.
We proceed as follows. Section 2 describes the research design for S&D's Monte Carlo experiments. Section 3 demonstrates that we are able to reproduce their results. Section 4 notes that the data generating process (DGP) in S&D's Monte Carlo experiments assumes homoskedasticity and effect homogeneity. In this setting, OLS is the optimal estimator, and we show that its performance dominates that of S&D's recommended estimator.
Section 5 extends their simulation framework to allow for heteroskedasticity and effect heterogeneity. It also expands the set of estimators to include the UWLS estimator preferred by S&D. Section 6 presents the results of the associated Monte Carlo experiments. We use these additional simulations to explain why the "suboptimal" estimator produces better results than the "correct" estimator. This also allows us to identify other estimators that produce even better results. Both a random effects version of their preferred estimator and a "smooth" estimator dominate S&D's recommended estimator. Section 7 concludes by arguing that there is insufficient understanding of the PCC problem at this time to support a best practice recommendation for meta-analyses with PCCs. Further research is needed.

| S&D'S RESEARCH DESIGN
S&D's research design consists of two stages. In the first stage, they generate 50 primary studies, each having an equal number of observations (either 25, 50, 100, 200, or 400). Each primary study is described by a linear regression DGP, where we generally maintain S&D's notation to facilitate comparison with their paper:

Y_i = β₀ + β₁x₁ᵢ + β₂x₂ᵢ + εᵢ,  (1)

where β₁ is the effect of interest. The corresponding t-value is converted to a PCC using Equation (2):

r_p = t / √(t² + df),  (2)

where df = n − 3. S&D use the parameter par to control the size of the t-statistic associated with β̂₁, and hence the size of r_p. The three values of par (= 1, 3, 9) correspond to asymptotic population mean values of r_p equal to 0.7071, 0.3162, and 0.1104, respectively (i.e., 1/√(1 + par²)). This process is repeated until a meta-analysis sample of 50 PCC values is collected.
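The first stage can be sketched in a few lines. This is a minimal illustration, not S&D's exact code; in particular, we assume par scales the error standard deviation, an assumption consistent with the stated asymptotic PCC values 1/√(1 + par²).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_pcc(n=100, par=3.0):
    """Simulate one primary study and return (r_p, df).

    Sketch only: we assume par scales the error standard deviation,
    which reproduces the stated asymptotic PCCs 1/sqrt(1 + par**2)
    (par = 1, 3, 9 gives 0.7071, 0.3162, 0.1104).
    """
    x1 = rng.standard_normal(n)
    x2 = rng.standard_normal(n)
    y = 1.0 + x1 + x2 + par * rng.standard_normal(n)

    # OLS of y on (1, x1, x2) and the t-statistic on x1's coefficient.
    X = np.column_stack([np.ones(n), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = n - 3
    sigma2 = resid @ resid / df
    se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    t = beta[1] / se_b1

    # Equation (2): convert the t-statistic to a PCC.
    return t / np.sqrt(t**2 + df), df

r_p, df = simulate_pcc(n=200, par=3.0)  # population PCC = 1/sqrt(10)
```

Repeating the call 50 times yields one meta-analysis sample of PCCs.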
In the second stage, a DerSimonian and Laird random effects estimator (RE) is used to estimate ρ. Two different weights are used. One is based on the "correct" variance of r_p, as recently demonstrated by Aert and Goos:2

S₁² = (1 − r_p²)² / df.  (3)

It is "correct" because this is the true sampling variance of r_p. Accordingly, one would expect inverse variance weighting using the "correct" variance to produce optimal estimates.
The other weight uses an incorrect estimate of the variance of r_p that is commonly employed in the economics meta-analysis literature:1,3

S₂² = (1 − r_p²) / df.  (4)

S&D refer to this estimate as the "suboptimal" estimate. S₂² differs from the "correct" variance in that its numerator is the square root of the numerator in S₁². S₂² is the correct sampling variance of r_p in the special case when ρ = 0.4,5 For each experiment, two RE estimates of ρ are produced, one based on S₁² and one based on S₂². S&D follow the two-stage process above and simulate 10,000 meta-analyses for each of 15 experiments corresponding to the different combinations of ρ ∈ {0.7071, 0.3162, 0.1104} and n ∈ {25, 50, 100, 200, 400}. S&D's surprising result is that inverse variance weighting using the "suboptimal" estimator produces superior results to the "correct" estimator.
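The two variance formulas and the second-stage RE estimate can be made concrete with a short sketch (a plain method-of-moments DerSimonian–Laird implementation, not S&D's code):

```python
import numpy as np

def pcc_variances(r_p, df):
    """The two competing sampling-variance estimates for a PCC."""
    s1_sq = (1 - r_p**2) ** 2 / df  # Equation (3): the "correct" variance
    s2_sq = (1 - r_p**2) / df       # Equation (4): the "suboptimal" variance
    return s1_sq, s2_sq

def dersimonian_laird(effects, variances):
    """Minimal DerSimonian-Laird random-effects estimate of the mean effect."""
    r = np.asarray(effects, float)
    v = np.asarray(variances, float)
    w = 1.0 / v
    fe = np.sum(w * r) / np.sum(w)            # fixed-effect (inverse-variance) mean
    q = np.sum(w * (r - fe) ** 2)             # Cochran's Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(r) - 1)) / c)   # method-of-moments heterogeneity
    w_re = 1.0 / (v + tau2)
    return np.sum(w_re * r) / np.sum(w_re)
```

Note that because 0 ≤ 1 − r_p² ≤ 1, S₂² is never smaller than S₁², and it responds less strongly to r_p; this property is exploited below.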

| REPLICATION OF S&D
Columns (3), (4), (7), (8), (11), and (12) of Table 1 copy the results from S&D's paper. The table is divided into three panels, reporting results for Bias, RMSE, and Coverage. Across the board, using the suboptimal weights based on S₂² results in lower Bias, smaller RMSE, and Coverage rates closer to 95%.
For example, when the true value of the PCC, ρ, equals 0.7071 and all the primary studies in the meta-analysis sample have 25 observations, the RE estimator based on S₂² produces estimates that have an average Bias of 0.0235, an average RMSE of 0.0279, and an average Coverage rate of 84.90%. In contrast, the "correct" estimator, S₁², produces estimates that have an average Bias of 0.0455, an average RMSE of 0.0479, and an average Coverage rate of 14.27%. The statistical dominance of S₂² holds for every experiment in the table.

Note to Table 1: Columns (2), (3), (4), (7), (8), (11), and (12) are reproduced from Table 1 in Stanley and Doucouliagos.1 They come from random effects estimates of ρ using PCC sampling variances S₁² and S₂². Columns (5), (6), (9), (10), (13), and (14) are replications of S&D's results using a multivariate normal data-generating process described in Section 3. The table demonstrates that S₂² is superior to S₁².
In the first stage of our analysis, we follow S&D's simulation design and confirm that we are able to exactly replicate their results aside from minor Monte Carlo error. Those results are reported in the online appendix. However, following comments from a reviewer, we also replicate S&D's results by simulating Y, x₁, and x₂ from a multivariate normal distribution in which the population PCC value, ρ, is set directly. The advantage of this approach is that it allows us to later introduce heterogeneity directly in ρ, consistent with the RE model. A DGP in which (Y, x₁, x₂) are drawn from a trivariate normal distribution whose covariance matrix fixes the partial correlation between Y and x₁ given x₂ at ρ is sufficient to replicate S&D's results. The corresponding results are reported in Columns (5), (6), (9), (10), (13), and (14), placed side by side with S&D's results. The Bias and RMSE results are virtually identical. The Coverage rates are slightly different but still demonstrate that S₂² has superior coverage properties compared to S₁².

| S&D'S RESEARCH DESIGN IS NOT WELL SUITED FOR THEIR EXPERIMENT
The observant reader might have noticed in the preceding description of S&D's research design that not only is the population value of β₁ homogeneous across primary studies, but the error terms all have the same variance. Since all primary studies included in a given meta-analysis have an equal number of observations, they will also have the same population values of se(β̂₁). In that case, when sampling variances are unknown and must be estimated, neither RE, FE, nor UWLS is efficient. The theoretically optimal estimator is OLS. We demonstrate this empirically in Table 2.
Columns (1) and (2) report details about the respective experiments. The first two columns of each of the next three sections reproduce the Bias, RMSE, and Coverage values from Table 1. But there is now a third column to the right of those columns reporting OLS estimates (see Columns 5, 8, and 11). As can be clearly seen, for each of the 15 experiments, OLS dominates the two RE estimators on Bias and RMSE.
TABLE 2 S&D's results explained.

Note to Table 2: Columns (2), (3), (4), (6), (7), (9), and (10) are reproduced from the replication results in Table 1. The OLS results in Columns (5), (8), and (11) use the same meta-analysis datasets but estimate ρ from an OLS regression of r_p on a constant term. Columns (12) and (13) report the coefficient of variation (CV) of S₁ and S₂. The table demonstrates that unweighted OLS regression is superior to weighted random effects estimates, and that the S₂² estimator is better than the S₁² estimator because it is closest to OLS.
We are now in a position to provide a first insight into S&D's results. As already noted, the optimal estimator given S&D's DGP is not RE, but OLS. The reason the RE estimator based on S₂² performs better is that it is closest to OLS. This is evident in the last two columns of Table 2. These two columns report the coefficient of variation (CV) for S₁ and S₂, where CV = (standard deviation/mean) × 100%. The CV for S₂ is approximately half that of S₁: the CV average for S₁ is 8.8%, compared to 4.5% for S₂. In other words, using the "suboptimal" estimate of the standard error produces weights that are more uniform than those using the "correct" estimate, and thus closer to the equal weights employed by OLS.
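The halving of the CV is not a coincidence of the simulations; it follows from the functional forms. Since S₂² = S₁ · df^(−1/2), we have log S₂ = ½ log S₁ + constant, so S₂ responds half as strongly to variation in r_p. A small illustrative check (our own numbers, not S&D's):

```python
import numpy as np

rng = np.random.default_rng(3)

def cv(x):
    """Coefficient of variation in percent."""
    return x.std(ddof=1) / x.mean() * 100

# Illustrative meta-analysis sample: 50 PCCs scattered around
# rho = 0.3162 with df = 197 (spread chosen for illustration).
df = 197
r_p = np.clip(rng.normal(0.3162, (1 - 0.3162**2) / np.sqrt(df), size=50),
              -0.99, 0.99)
s1 = (1 - r_p**2) / np.sqrt(df)         # sqrt of the "correct" variance
s2 = np.sqrt(1 - r_p**2) / np.sqrt(df)  # sqrt of the "suboptimal" variance

# log s2 = 0.5 * log s1 + const, so CV(s2) is roughly half of CV(s1):
# the "suboptimal" weights are more uniform, i.e., closer to OLS.
ratio = cv(s2) / cv(s1)
```

The ratio comes out near one half, matching the pattern in the last two columns of Table 2.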

| A FAIRER TEST
In fact, S&D's simulations do not provide a fair test of the consequences of using the "suboptimal" estimator for PCC standard errors because their Monte Carlo data environments assume homoskedasticity and effect homogeneity in the primary studies. In this section we present results from additional simulations that build on S&D's research design but add heteroskedasticity and effect heterogeneity. We will demonstrate that the superior performance of the "suboptimal" estimator extends even to these more realistic environments. The more general data environments will also allow us to better understand what drives its superior performance. Further, they will allow us to identify other estimators that perform even better.
Case 1. Homoskedasticity and effect homogeneity in the primary studies. Case 1 is identical to the S&D simulations above. We focus on the case where all primary studies have 200 observations.

Case 2. Heteroskedasticity and effect homogeneity in the primary studies. Case 2 introduces heteroskedasticity within meta-analyses by allowing primary studies to have differing numbers of observations. In particular, the sample size for each primary study is determined by a random draw from a uniform distribution of integers ranging from 25 to 400, n ~ U(25, 400). S&D restricted all primary studies for a given meta-analysis to have the same sample size. We mix primary studies with different sample sizes in the same meta-analysis. This induces heteroskedasticity, with estimates from primary studies having more observations being more precisely estimated.
Other than the mix in sample sizes, we follow the same data generation and estimation procedure as in Case 1.
Case 3. Heteroskedasticity and heterogeneity in the primary studies. Case 3 adds effect heterogeneity directly into the DGP of the primary studies. Specifically, each primary study's population effect is drawn as

ρᵢ = ρ + νᵢ,  νᵢ ~ N(0, τ²),  (6)

consistent with the RE model. As in Case 2, 50 primary studies are included in each meta-analysis, with each study having observations equal to an integer randomly drawn from a uniform distribution taking values between 25 and 400.
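The three cases differ only in how sample sizes and study-level effects are drawn. A compressed sketch of a Case 3 meta-analysis sample follows; for brevity it approximates each sample PCC by a normal draw around the study-level effect rather than running the full primary-study regression, and the value τ = 0.05 is an arbitrary illustration, not a parameter taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

def meta_sample_case3(rho=0.3162, tau=0.05, k=50):
    """Sketch of one Case 3 meta-analysis sample.

    k primary studies with n ~ U{25,...,400} (heteroskedasticity) and
    study-level effects rho_i = rho + N(0, tau^2) (Equation (6)).
    The per-study PCC is approximated by a normal draw around rho_i
    instead of a full regression; tau = 0.05 is illustrative only.
    """
    n = rng.integers(25, 401, size=k)   # 25..400 inclusive
    df = n - 3
    rho_i = rho + tau * rng.standard_normal(k)
    noise = (1 - rho_i**2) / np.sqrt(df) * rng.standard_normal(k)
    r_p = np.clip(rho_i + noise, -0.99, 0.99)
    return r_p, df
```

Setting tau=0 recovers Case 2, and fixing n at 200 as well recovers Case 1.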
The estimators. While S&D's featured results focused on a random effects estimator, they note that they obtained the same conclusion using a fixed effect (FE) estimator and an estimator they call UWLS, for unrestricted weighted least squares. UWLS produces coefficient estimates that are identical to FE but makes an adjustment to the standard errors. In multiple papers, Stanley and Doucouliagos promote the superior performance of UWLS.6,7 For example, in a recent paper on medical research, they (along with coauthors) conclude: "UWLS frequently dominates RE in medical research, often substantially. Thus, the UWLS should be reported routinely in the meta-analysis of clinical trials."6 Accordingly, our additional simulations investigate the performance of the UWLS estimator. It should perform well in Case 2, when there is heteroskedasticity and effect homogeneity, since these are the conditions for which the FE estimator was developed.8 As in S&D, we will have two versions of the UWLS estimator, one using inverse variance weights 1/S₁² and one using 1/S₂². We also include the OLS estimator, since it is expected to perform well in Case 1, when there is homoskedasticity and effect homogeneity. And since Case 3 introduces effect heterogeneity, we also introduce a random effects version of Stanley and Doucouliagos' UWLS estimator. It uses inverse variance weights 1/(Sⱼ² + τ̂²), j = 1, 2, where τ² measures heterogeneity in ρ. As Aert and Jackson show, this estimator is equivalent to the Hartung-Knapp method for random-effects meta-analysis and is thus a natural extension of S&D's UWLS estimator.9 To distinguish the two UWLS models, we refer to them as UWLS (FE) and UWLS (RE).
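A minimal sketch of the UWLS estimator helps fix ideas. One standard way to compute it is to regress tᵢ = rᵢ/seᵢ on xᵢ = 1/seᵢ with no constant: the slope equals the inverse-variance (FE) weighted mean, while the standard error is "unrestricted" because the regression error variance is estimated rather than fixed at 1. This is our own implementation sketch, not S&D's code.

```python
import numpy as np

def uwls(effects, variances, tau2=0.0):
    """Minimal sketch of unrestricted WLS (UWLS) estimation.

    Slope of the no-constant regression of r_i/se_i on 1/se_i, which
    equals the inverse-variance weighted mean; the SE is estimated
    from the regression residuals rather than fixed at 1. Setting
    tau2 > 0 gives the RE version, with weights 1/(S_i^2 + tau2).
    """
    v = np.asarray(variances, float) + tau2
    se = np.sqrt(v)
    y = np.asarray(effects, float) / se
    x = 1.0 / se
    slope = (x @ y) / (x @ x)  # identical to the FE weighted mean
    resid = y - slope * x
    k = len(y)
    slope_se = np.sqrt(resid @ resid / (k - 1) / (x @ x))
    return slope, slope_se
```

Passing S₁² or S₂² as the variances gives UWLS (FE-S₁²) or UWLS (FE-S₂²); supplying an estimate of τ² gives the corresponding UWLS (RE) versions.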
The experiments. Following S&D, we continue to use ρ values of 0.7071, 0.3162, and 0.1104. Given the three cases above, this yields nine experiments. Each experiment generates 10,000 simulated meta-analyses, and each meta-analysis consists of 50 primary studies. We calculate Bias, RMSE, and Coverage for each of the five estimators: OLS, UWLS (FE-S₁²), UWLS (FE-S₂²), UWLS (RE-S₁²), and UWLS (RE-S₂²), with one small change from before. The sample correlation coefficient is a biased estimator of the population correlation coefficient, with E[r_p] falling short of ρ by approximately ρ(1 − ρ²)/(2n).10 Accordingly, we incorporate this adjustment in calculating Bias, RMSE, and Coverage in simulations where meta-analyses consist of primary studies with the same number of observations. When primary studies differ in their sample sizes, we calculate E[r_p] by simulating 1,000,000 values of r_p and taking the average. (We note that this was not done in the simulations of Tables 1 and 2 because those were replications of S&D, and S&D did not make this adjustment in their analysis.) Finally, as before, we report CVs for S₁ and S₂ to compare their variation. Table 3 is organized similarly to Tables 1 and 2.
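The small-sample adjustment used when all studies share the same n can be computed directly from the first-order approximation:

```python
def expected_pcc(rho, n):
    """First-order small-sample mean of the sample correlation:
    E[r] ~ rho - rho * (1 - rho**2) / (2 * n)."""
    return rho - rho * (1 - rho**2) / (2 * n)
```

The shortfall is noticeable for n = 25 but essentially negligible by n = 400, which is why the adjustment matters most in the smallest-sample experiments.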
| Case 1

Table 3 reports the results for the experiments corresponding to Case 1. These results should be similar to S&D's, since the DGP for each experiment is characterized by homoskedasticity and effect homogeneity in the primary studies. The main difference is that Table 3 compares performance for the following estimators: OLS, UWLS (FE-S₁²), UWLS (FE-S₂²), UWLS (RE-S₁²), and UWLS (RE-S₂²). Columns (1)-(3) report design details for the respective experiments, where we recall that τ² characterizes heterogeneity in ρ; τ² = 0 indicates effect homogeneity. Columns (4) and (5) report the CV values for S₁ and S₂, and Columns (6)-(10) report the performance results for the respective estimators.

Consistent with Table 2, S₁ and S₂ display relatively little variation, with coefficients of variation ≤10%, and CV(S₂) approximately half that of CV(S₁). In line with S&D's findings, UWLS (FE-S₂²) is superior to UWLS (FE-S₁²), and UWLS (RE-S₂²) is superior to UWLS (RE-S₁²), on the dimensions of Bias, RMSE, and Coverage. OLS, however, is superior to them all. For example, the average Bias for OLS is 0.0000, and the next closest is 0.0024 (UWLS (RE-S₂²)). The average RMSE for OLS is 0.0080, and the next closest is 0.0086 (UWLS (FE-S₂²) and UWLS (RE-S₂²)). And the average coverage rate for OLS is 95.02%, while the next closest is 92.51% (UWLS (RE-S₂²)).

The table uses highlighting to focus attention on three things. S&D specifically recommend the S₂² estimator when the PCC is not very large: "We recommend using the SEs derived as the square root of S 2 2 from Equation (2) when the primary studies have n > 100 and the PCCs are not very large." Accordingly, the gray-highlighted row identifies the smallest of the three ρ values, ρ = 0.1104. The blue-highlighted column identifies S&D's preferred UWLS estimator (UWLS (FE-S₂²)). And the yellow-highlighted cells identify the "best" performance results, where "best" is respectively defined as smallest Bias, smallest RMSE, or coverage rate closest to 95%. Even in the specific scenario for which S&D recommend the S₂² estimator, OLS is superior.

Note to Table 3: The table confirms the finding that the S₂² estimators dominate the S₁² estimators. However, OLS dominates the weighted estimators when primary studies are characterized by homoskedasticity and effect homogeneity.

| Case 2
Table 4 reports results for Case 2, where primary studies are characterized by homogeneity in the effect size but heteroskedasticity in the estimated effects. The latter is reflected in the increased variation of S₁ and S₂, as seen in Columns (4) and (5). The respective CV values increase from an average of 5.4% and 2.7% in Table 3 to 41.5% and 40.7% in Table 4. The cause of this increase is that within any given meta-analysis there is now a mix of primary studies with sample sizes ranging from 25 to 400 observations.
Turning now to the results in Columns (6) through (10) (ignore Columns (11) and (12) for the moment), we see that OLS dominates the UWLS estimators on Bias, but the UWLS-S₂² estimators dominate on RMSE. For example, the average Bias for OLS is 0.0002; the next closest is 0.0026, for UWLS (RE-S₂²). On the other hand, UWLS (FE-S₂²) and UWLS (RE-S₂²) dominate on RMSE, with an average value of 0.0084.

TABLE 4 Comparison of estimators given heteroskedasticity and homogeneity in the primary studies (Case 2).

Note to Table 4: The simulations underlying this table are identical to those in Table 3 except that the DGPs of the primary studies are characterized by heteroskedasticity, as described in Case 2 of Section 5. In addition to the estimators in Table 3, the table also reports estimates from two additional estimators: an unrestricted WLS fixed effect estimator and an unrestricted WLS random effects estimator, where the respective inverse variance weights are based on the "smooth" estimator of Equation (7). The blue-highlighted column identifies S&D's preferred estimator, and the gray-highlighted rows indicate the data environments for which they recommend it. Yellow-highlighted cells indicate "best" results for the respective performance metric. The table shows that the S₂² estimators continue to dominate the S₁² estimators when the data environment is generalized to include heteroskedasticity. A bias-variance trade-off is evident in that OLS is superior to the S₂² estimators on Bias while the S₂² estimators are superior on RMSE. However, the two UWLS (Smooth) estimators have the lowest RMSE of all.

TABLE 5 Comparison of estimators given heteroskedasticity and heterogeneity in the primary studies (Case 3).

| Bias-variance trade-off
It was easy to understand why S₂² did better than S₁² in the simulations underlying Tables 1-3, where OLS is unbiased and efficient. What is not obvious is why it generally does better in the heteroskedastic environment of Table 4. Weighting by the inverse of the "correct" variance should produce estimates that have least variance. In fact, it does. The problem lies with the bias. As S&D note, all inverse-variance meta-analyses, whether based on S₁² or S₂², are biased because the weights are functions of the estimated effect sizes. This bias-variance trade-off is clearly evident in the relative performances of the OLS and S₂² estimators in Table 4. OLS is unbiased in the homogeneous effect environment of Table 4. However, the presence of heteroskedasticity tips the balance towards the S₂² estimators because they have lower variance, so that they end up dominating on RMSE.
These experiments illustrate the difficulty of finding a best estimator for all situations. OLS made a superior bias-variance trade-off in Table 3, producing the smallest RMSE. The S₂² estimator made a superior bias-variance trade-off in Table 4. But there may yet be other estimators that do better. We next provide such an example.
Define S̄ by replacing r_p in Equations (3) and (4) with its meta-analytic sample mean, r̄_p:

S̄₁ᵢ² = (1 − r̄_p²)² / dfᵢ,  S̄₂ᵢ² = (1 − r̄_p²) / dfᵢ.  (7)

S̄ is called a "smooth estimator" because it uses averages.11 The only difference between Equation (7) and Equations (3) and (4) is that r_p is replaced with its meta-analytic sample mean, r̄_p. If we substitute S̄ in UWLS (FE) and UWLS (RE), we get FE and RE versions of UWLS (Smooth). These are reported in Columns (11) and (12) of Table 4.
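The smooth weights can be sketched as follows (our own sketch; we use the S₁-style numerator, and note that with τ² = 0 the choice of numerator only rescales all weights by a common factor, leaving the relative weights unchanged):

```python
import numpy as np

def smooth_weights(r_p, df, tau2=0.0):
    """Inverse-variance weights from the smooth variance of Equation (7):
    each study's r_p is replaced by the meta-analytic mean before the
    variance is computed. S1-style numerator used here; with tau2 = 0
    the numerator choice only rescales the weights.
    """
    r_p = np.asarray(r_p, float)
    df = np.asarray(df, float)
    r_bar = r_p.mean()
    s_bar_sq = (1 - r_bar**2) ** 2 / df
    return 1.0 / (s_bar_sq + tau2)
```

Because r̄_p is common to all studies, the FE weights vary only with dfᵢ: no study's own r_p affects its weight, which is the mechanical reason the smooth estimator remains unbiased.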
Both OLS and the two UWLS (Smooth) estimators are unbiased because their weights are unaffected by the individual r_p values. This is reflected in the simulations, with the three estimators virtually tied on the dimension of Bias. However, the two UWLS (Smooth) estimators dominate on RMSE because they give greater weight to estimates that are more precise by virtue of coming from regressions with more observations (cf. Equation 7). As a result, they have lower variance and a smaller RMSE.
Compared to the S₁² and S₂² estimators, the two UWLS (Smooth) estimators have greater variance. However, their advantage on Bias more than compensates for their deficiency on variance, so that they have lower RMSE. They make a better bias-variance trade-off than the S₁² and S₂² estimators.

| Case 3
Table 5 reports results when primary studies are characterized by both heteroskedasticity and effect heterogeneity. As before, OLS and the smooth estimators are unbiased. However, the introduction of effect heterogeneity now allows the RE estimators to outperform their FE analogues. So UWLS (RE-S₁²) performs better than UWLS (FE-S₁²), UWLS (RE-S₂²) performs better than UWLS (FE-S₂²), and UWLS (Smooth-RE) performs better than UWLS (Smooth-FE).
However, the main takeaway from Table 5 is that the smooth estimators again make the best bias-variance trade-off, with both smooth estimators dominating their respective analogues on RMSE: UWLS (Smooth-RE) < UWLS (RE-S₂²) < UWLS (RE-S₁²), and UWLS (Smooth-FE) < UWLS (FE-S₂²) < UWLS (FE-S₁²).

To summarize, the reason S&D's "suboptimal" variance estimator produces better outcomes than the "correct" estimator is that it makes a better trade-off between bias and variance. However, it does not follow that it should be the meta-analyst's estimator of choice when working with PCCs. Even in the scenarios that S&D cite for their recommendation (large sample size and small PCC), there are other estimators that perform better.

| CONCLUSION
This study builds on the simulation framework of a recent paper by Stanley and Doucouliagos.1 S&D use simulations to support a recommendation that meta-analyses using partial correlation coefficients (PCCs) should employ a "suboptimal" estimator of the PCC's sampling variance, denoted S₂², when constructing weights for fixed effect and random effects estimation. While we confirm their simulation findings, their simulations and recommendation may give meta-analysts a misleading impression.
S₂² performs better than the "correct" estimator because it does a better job of trading variance for bias. However, as demonstrated in this study, other estimators do an even better job, even within the simulation environments that S&D construct to demonstrate their claim. In fact, S&D's recommended estimator is never the "best" estimator on any dimension in any experiment. Thus S&D's findings should not be interpreted as an endorsement of the use of UWLS with S₂² in meta-analyses with PCCs.
Where does that leave us with recommendations for best practice? The simulations in this study identify the smooth estimator as having many desirable properties. However, we have not addressed its performance in the presence of publication bias; nor have we considered other alternatives, such as Fisher's z.12,13 While the analysis here focuses on PCCs, similar issues arise whenever the variance of the estimated effect is a function of the effect size, as with Cohen's d and log odds ratios. This is a topic that would benefit from further research.

Note to Table 3: The DGPs underlying the simulations in Table 3 are identical to the ρ ∈ {0.7071, 0.3162, 0.1104}, n = 200 replication experiments in Tables 1 and 2.

Note to Table 5: The simulations underlying Table 5 are identical to those in Table 4 except that the DGP of the primary studies is characterized by effect heterogeneity, as described by Equation (6) in the text. The blue-highlighted column identifies S&D's preferred estimator, and the gray-highlighted rows indicate the data environments for which they recommend it. Yellow-highlighted cells indicate "best" results for the respective performance metric. The table shows that the S₂² estimators continue to dominate the S₁² estimators when the data environment is generalized further to include effect heterogeneity. It also shows that the random effects estimators dominate their fixed effect analogues. The UWLS (Smooth-RE) estimator is superior to all the other estimators on the dimension of RMSE.