Corruption's Direct Effects on Per‐Capita Income Growth: A Meta‐Analysis

Corruption is a symptom of weak institutional quality and could have potentially adverse effects on economic growth. However, heterogeneity in reported findings makes it difficult to synthesize the evidence base with a view to test competing hypotheses and/or support evidence‐based policy and practice. To address this issue, we have extracted 327 estimates of corruption's direct effect on per‐capita GDP growth from 29 primary studies, following a peer‐reviewed and pre‐published systematic review protocol. Precision‐effect and funnel asymmetry tests indicate that corruption has a negative effect on per‐capita GDP growth after controlling for publication selection bias and within‐study dependence. However, multivariate meta‐regression analysis results indicate that the overall effect is not robust to inclusion of moderating variables through a general‐to‐specific procedure for model specification. We report that the marginal effect of corruption on per‐capita GDP growth is more adverse when the primary study estimates relate to long‐run growth, are based on low‐income‐country data only, and extracted from journal papers. The effect is less adverse in studies that use the International Country Risk Guide corruption perceptions index and in those reporting estimates from two‐stage least‐squares estimations.


Introduction
Corruption is an ancient problem, with which social scientists and policymakers have grappled for centuries (Bardhan, 1997). Nevertheless, our search results indicate that the increase in the number of studies on the relationship between corruption and growth is a recent phenomenon, with average hits increasing from about 10 per year in 1990 to 25 in the mid-1990s and to more than 100 in 2009. The increased research effort coincides with heightened policy concerns that liberalization reforms in the post-cold-war period were not delivering the expected growth benefits because of institutional weaknesses in general and the prevalence of corruption in particular. Hence, it appears that the earlier emphasis on corruption as a by-product of regulation and government intervention (Huntington, 1968) has been overtaken by a new debate on how corruption may coexist with market reforms, posing a new challenge for policy and practice.
The empirical work on corruption and growth combines the institutional approach to economic performance (North, 1990, 1994) with the empirics of growth literature (Barro, 1991; Levine and Renelt, 1991; Mankiw et al., 1992). The findings are heterogeneous not only because of different measures of corruption and growth, but also because of different estimation methods, country coverage, and sample periods. Hence, policymakers and researchers alike find it difficult to derive verifiable conclusions about corruption's effect on growth. This paper aims to bridge the evidence gap by conducting a systematic review 1 of the literature and providing synthesized evidence based on meta-analysis methods.
The paper is organised in five sections. Section 2 provides a brief overview of the empirical growth model that informs the estimation strategies of the included primary studies. Section 3 summarizes the systematic review methodology that informs our search strategy, inclusion and exclusion criteria, and data extraction/coding procedures. Section 4 introduces the meta-analysis tools/methods and the findings. Finally, Section 5 provides a summary of the findings and derives some policy and research conclusions.

Corruption and Growth: A Brief Overview
Studies reviewed here use cross-section or panel data to estimate empirical growth models that follow Barro (1991) and Mankiw et al. (1992). In these models, growth depends on the initial level of per-capita income, technical efficiency and investment in physical and human capital. Another feature of the empirical growth models is that technical efficiency grows at the same rate (g) across countries, although its initial level varies between countries because of country-specific factors such as geography. Later empirical work introduces institutional quality, openness to trade or financial development as additional factors that influence technological efficiency. Hence, model building in the empirical growth literature consists of two components: a formal component that draws upon a Cobb-Douglas production function augmented with human capital; and an informal component driven mainly by empirical innovations (Temple, 1999). The primary studies included in this review innovate within a Cobb-Douglas production function with human capital and constant returns to scale, which is given in (1) below.
Here, y_it is the growth rate of per-capita GDP, s_k and s_h are the rates of investment in physical and human capital, n is the rate of population growth, g is the growth rate of technical efficiency, δ is the rate of capital depreciation, γ_1, γ_2 and γ_3 are coefficients to be estimated, v_i is the country fixed effect and u_it is the error term. This model has been estimated by various contributors, including Levine and Renelt (1991), Mankiw et al. (1992), Sachs and Warner (1997), and Gyimah-Brempong and Traynor (1999). Researchers who estimate the direct effect of corruption on growth innovate within this model by adding two types of variables: (1) a perceptions-based measure of corruption; and (2) other variables reported to be relevant in previous empirical work, such as financial development, openness to trade, foreign direct investment, etc. Hence, a generic form of the model used in primary studies can be stated as follows:
Here, Corr is the corruption variable and its coefficient (θ) is the effect-size estimate reported in primary studies. CV_f is the vector of the variables included in the formal model whilst CV_k is the vector of control variables that reflect innovations related to trade openness, financial development, etc. The meta-analysis in this paper is based on coefficient estimates that measure corruption's direct effects on per-capita GDP growth, namely (θ). We focus on direct effects only because few studies (three in total) estimate indirect effects and because it is difficult to pool indirect effects that unfold via three different channels: human capital, public finance and investment.
Among meta-analysis studies on growth, Abreu et al. (2005) try to establish the magnitude of the convergence rate. Doucouliagos (2005) and Efendic et al. (2011), respectively, meta-analyse the effect of economic freedom and institutional quality indices on growth. These studies report that the rate of convergence and the effect size are larger in magnitude when the bias resulting from unobserved heterogeneity is controlled for. Therefore, we control for heterogeneity through multivariate meta-regression analysis (MRA). We derive the MRA model through a general-to-specific procedure, which involves exclusion of insignificant covariates with the largest p values one at a time until all remaining covariates are significant (Charemza and Deadman, 1997; Doucouliagos and Stanley, 2009). This procedure is preferred because of its tractability and its effectiveness in addressing the issue of overdetermination in the MRA model to be estimated.
Huntington (1968) argued that corruption may 'grease the wheels' by enabling economic agents to engage in beneficial activities that may otherwise be unfeasible because of high levels of bureaucratic hold-ups in highly-regulated countries. However, Myrdal (1968) demonstrated that the hold-ups that private agents try to circumvent through corruption and the level of corruption itself should be considered as symptoms of underlying institutional weaknesses that enable the political elite to maximise bribe revenue by increasing the level of administrative hold-ups. North's (1990, 1994) seminal contributions on institutions and economic performance appear to have vindicated Myrdal's critique and have informed a wide range of studies on the relationship between corruption and economic performance. Initially, the new work focused on microeconomic efficiency, investigating the effects of corruption on entrepreneurial skills and technology choice.
This work became prominent in the 1980s and early 1990s, and is reviewed in Svensson (2005). The focus shifted to the macrolevel with . Although  did not find a significant relationship between corruption and growth, Mauro (1997) used a larger dataset and concluded that the effect of corruption on investment and per-capita income growth was negative and statistically significant. A one-standard-deviation improvement in the control of corruption is found to be associated with a 4 percentage point increase in the investment rate and a 0.5 percentage point increase in per-capita income growth per annum. Mo (2001) and  lend support to Mauro's findings, which indicate that corruption is detrimental to economic growth. Bardhan (1997) is the first review of the work on both micro- and macro-level effects of corruption and offers two conclusions. First, the static effects of corruption on efficiency can be positive or negative, depending on the severity of the policy-induced distortions and the organisation of corruption (i.e. whether corruption is centralised or decentralised). Secondly, the effect of corruption on growth depends on how it affects the composition of investment, the quality of public finance and the incentives to invest in physical and human capital. Although Bardhan (1997) suggests that corruption's effect on growth is likely to be negative, he also indicates that this conclusion is based on historical experience rather than contemporary empirical research, which was in its infancy at the time. Two years later, Wei (1999) reported an increase in the number of empirical studies and concluded that corruption's adverse effect on growth results from reduced domestic investment, discouraged foreign direct investment, overspending in government and distorted composition of government spending. More recently,  and Méon and Weill (2010) have reported more nuanced findings.
They indicate that corruption has regime-specific effects on growth, with weaker effects in countries with weak institutions.
To sum up, the existing literature tends to indicate that the effect of corruption on growth is likely to be negative, but would vary between countries and types of corruption. This variation is highly likely to be augmented by other moderating factors that characterise the research field, including differences in estimation methods, data structure, model specifications and corruption measures used. In this context, meta-analysis offers objective and verifiable means for synthesizing the evidence base. To ensure the objectivity and replicability of the analysis in this paper, we follow a peer-reviewed and pre-published systematic review protocol that specifies the search, inclusion/exclusion and data extraction criteria.

Systematic Review Methodology
A systematic review of the corruption-growth relationship poses specific challenges because corruption is an illicit practice and as such it is difficult to measure. Also, corruption may involve different practices in different countries. To address these challenges, we adopt a principal-agent definition of corruption as suggested by Groenendijk (1997). In this definition, the agent (usually an appointed or elected public official) abuses his/her public authority and imposes a surcharge on the delivery of a service to a principal, who may be a natural or legal entity unable to hold the agent to account because of high monitoring costs.
The principal-agent definition is inclusive enough to accommodate different types of corruption, but it is by no means a universally agreed one (see Philp, 2006; Banerjee et al., 2012). We adopt the principal-agent definition because it captures bureaucratic or political corruption, bribes and nepotism, and fraud and embezzlement. It also includes practices that distort the function of the public office, whether these practices are formally illegal or not. A blanket exclusion of acts that are not banned legally, as proposed by Philp (2006) and Banerjee et al. (2012), may entail underestimating the level of corruption and overlooking the endogenous nature of the legal provisions that differentiate between corrupt and noncorrupt behaviour.
Indeed, the corruption data sources used in the primary studies do not distinguish between legal and illegal corruption; they try to capture corrupt practices that fit well with the principal-agent definition. For example, the International Country Risk Guide (ICRG) index is constructed to capture the perceptions of respondents with respect to 'actual or potential corruption in the form of excessive patronage, nepotism, job reservations, "favour-for-favours", secret party funding, and suspiciously close ties between politics and business'. The Transparency International (TI) index, on the other hand, aims to capture 'information about the administrative and political aspects of corruption', through questions related to 'bribery of public officials, kickbacks in public procurement, embezzlement of public funds' and questions that 'probe the strength and effectiveness of public sector anti-corruption efforts'. Finally, the Worldwide Governance Indicators (WGI) scores aim to capture 'perceptions of the extent to which public power is exercised for private gain, including both petty and grand forms of corruption, as well as "capture" of the state by elites and private interests.' 2
Our systematic review methodology draws on guidelines proposed by the Centre for Reviews and Dissemination of the University of York (CRD, 2009), which reflect best practice in systematic reviews registered with the Campbell and Cochrane Collaborations. We searched 20 electronic databases for journal papers, working papers, reports and PhD theses, using 32 keywords for corruption, growth and developing countries, and 43 low-income-country (LIC) names. 3 The search produced 1,042 hits, the title/abstract screening of which was carried out on the basis of PIOS (Population, Independent variable, Outcome, Study design) criteria. 4 The title/abstract screening led to selection of 338 studies for critical evaluation.
In this second stage, we evaluated the studies on the basis of full-text information using validity, reliability and applicability (VRA) criteria. The critical evaluation led to inclusion of 40 empirical studies, of which 32 studies reported effects on per-capita GDP growth. The remaining 8 studies reported effects on GDP or per-capita GDP levels. We extracted data from all included studies (40 studies), but the data on the relationship between corruption and GDP and per-capita GDP levels is not used in this meta-analysis for two reasons. First, we wanted to focus only on corruption's effect on per-capita GDP growth, which is the outcome variable in model (2) above. Secondly, the evidence base on the relationship between corruption and income levels is small.
Of the included studies, only three estimate the indirect effects of corruption on per-capita GDP growth, i.e., the effects of corruption on growth when corruption is interacted with investment, public finance and human capital. We have excluded these estimates from the meta-analysis too because they refer to three different transmission channels that cannot be aggregated. Hence, our focus is only on the direct effects of corruption on per-capita GDP growth reported in 29 primary studies.
Primary studies draw upon four main sources of corruption data, which consist of scores based on perceived levels of corruption. The scores have different scales, which range from 0 to 6 for ICRG data; from −2.5 to +2.5 for WGI data; from 0 to 12 for TI data; and different ranges in Other corruption data sources. 5 With the exception of TI data, an increase in the score refers to less corruption. To ensure consistency, most of the original studies transform the corruption indices such that the scale is kept the same but an increase in the score is made to reflect increased corruption. In the minority of cases where such transformation is not carried out, we have coded the reported estimates as 'not transformed' and multiplied the reported estimates by (−1) to ensure sign compatibility.
For meta-analysis, we have extracted all direct-effect estimates reported in included studies. The alternative would have been to choose a summary statistic (for example the average or median) for each study or a single estimate chosen on the basis of significance or sample size. Such alternatives, however, would have two major shortcomings. First, they would prevent the use of all available information. Secondly, the selection criterion is likely to have a subjective dimension that could bias the results or at least reduce the scope for comparability or replicability of the findings in different meta-analysis studies (de Dominicis et al., 2008; Stanley, 2008; and Stanley and Doucouliagos, 2009).
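The sign-harmonisation step described above can be illustrated with a minimal sketch. The record layout, the study labels and the coefficient values below are hypothetical and serve only to show the rule: estimates coded as 'not transformed' are multiplied by −1 so that a higher score always means more corruption.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    study: str          # hypothetical study label
    coefficient: float  # reported coefficient on the corruption variable
    transformed: bool   # True if a higher score already means more corruption


def harmonise_sign(est: Estimate) -> float:
    """Return the coefficient oriented so that a higher score means more corruption."""
    return est.coefficient if est.transformed else -est.coefficient


# Illustrative values only; they are not taken from any primary study.
estimates = [
    Estimate("Study A", -0.012, transformed=True),
    Estimate("Study B", 0.008, transformed=False),
]
harmonised = [harmonise_sign(e) for e in estimates]
print(harmonised)
```

After harmonisation, a negative coefficient can be read in the same way for every estimate: more corruption is associated with lower per-capita GDP growth.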

Meta-analysis Tools and Methods
We first calculate partial correlation coefficients (PCCs), which measure the association between corruption and per-capita GDP growth while other explanatory variables are held constant. PCCs are comparable across studies as they are independent of the metrics with which the independent and dependent variables are measured. An alternative measure would have been elasticities, which are also comparable across studies and measure the effect size in an economically more meaningful manner. However, primary studies do not provide sufficient information necessary to calculate elasticities. Therefore, partial correlations are used extensively in meta-analysis (see Doucouliagos and Ulubasoglu, 2008; Doucouliagos and Laroche, 2009).
Secondly, we calculate fixed-effect weighted means of the PCCs to provide a descriptive summary of the evidence reported by each study. Thirdly, we provide funnel plots for visual inspection of publication selection bias. In the fourth stage, we conduct precision-effect and funnel-asymmetry tests (PETs and FATs) to quantify the publication selection bias and to verify if a genuine effect exists after controlling for publication selection. In the fifth stage, we conduct multivariate MRA by modelling the sources of variation explicitly. By combining results from a number of studies and regressing the estimates on moderating factors that characterise the research field, the MRA provides more reliable effect estimates than individual studies themselves or groups of studies (Stanley and Jarrell, 1989; Kulinskaya et al., 2008).

Estimators and Models
The PCC (r_i) for each effect-size estimate and its standard error (se_ri) are calculated in accordance with (3) and (4) below.
$$ r_i = \frac{t_i}{\sqrt{t_i^{2} + df_i}} \qquad (3) $$
and
$$ se_{r_i} = \sqrt{\frac{1 - r_i^{2}}{df_i}} \qquad (4) $$
Here, t_i and df_i are the t-statistic and degrees of freedom associated with effect-size estimates reported in primary studies. The standard error (se_ri) represents the variation because of sampling error, and the inverse of its square is used as the weight to calculate fixed-effect estimators (FEEs) for study-based weighted means. The latter take account of within-study variation by assigning lower weights to less precise estimates. Weighted means are more reliable than simple means, but they cannot be considered as genuine measures of effect size if primary-study estimates are subject to selection bias and/or affected by within-study dependence between reported estimates (de Dominicis et al., 2008).
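The calculation of the PCC, its standard error and the fixed-effect weighted mean can be sketched as follows. The sketch assumes the standard definitions used in the meta-regression literature, r = t / sqrt(t² + df) and se = sqrt((1 − r²) / df), with inverse-variance weights 1/se². The function names and the (t, df) pairs are hypothetical and are not drawn from the primary studies.

```python
import math


def pcc(t: float, df: int) -> float:
    """Partial correlation coefficient from a t-statistic and its degrees of freedom."""
    return t / math.sqrt(t * t + df)


def pcc_se(r: float, df: int) -> float:
    """Standard error of the partial correlation coefficient."""
    return math.sqrt((1.0 - r * r) / df)


def fixed_effect_mean(rs, ses):
    """Inverse-variance (fixed-effect) weighted mean of PCCs and its standard error."""
    weights = [1.0 / (se * se) for se in ses]
    mean = sum(w * r for w, r in zip(weights, rs)) / sum(weights)
    se_mean = math.sqrt(1.0 / sum(weights))
    return mean, se_mean


# Hypothetical (t-statistic, degrees of freedom) pairs from one study.
reported = [(-2.5, 60), (-1.2, 45), (0.4, 80)]
rs = [pcc(t, df) for t, df in reported]
ses = [pcc_se(r, df) for r, (_, df) in zip(rs, reported)]
mean, se_mean = fixed_effect_mean(rs, ses)
print(round(mean, 4), round(se_mean, 4))
```

Under these definitions the ratio r/se reproduces the original t-statistic, which is why the t-value can serve as the dependent variable once the PCC model is divided through by the standard error.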
In the next step, we conduct PET-FAT analysis to establish whether the effect-size estimates are subject to publication selection bias and whether they represent genuine effects beyond bias. The PET involves estimating a weighted least squares (WLS) bivariate model used widely in the literature (see, for example, Abreu et al., 2005; Stanley, 2005; Stanley and Doucouliagos, 2007; Stanley, 2008; and Efendic et al., 2011). Stanley (2008) demonstrates that model (5) below can be used to test for both funnel asymmetry (publication selection bias) and for genuine effect beyond selection bias.
Here, t_i and se_ri, respectively, are the test statistic reported by primary studies and the standard error of the PCC as specified in (4) above. The FAT involves testing for α_0 = 0, whilst the PET tests for β_0 = 0. The FAT is known to have low power, i.e., a low probability of rejecting the null hypothesis when the latter is actually false. Against this weakness, model (5) has the advantage of testing for genuine effect when publication selection bias is controlled for. Doucouliagos and Stanley (2009, 2013) indicate that the selection bias should be considered as substantial if |α_0| ≥ 1 and as severe if |α_0| ≥ 2. Furthermore, Stanley and Doucouliagos (2007; 2012, chapter 4) indicate that the reported effect-size and its standard error have a nonlinear relationship if the PET indicates the existence of genuine effect. In such cases, they propose a precision-effect estimation with standard errors (PEESE) to obtain a corrected estimate of β_0. The PEESE model can be stated as follows:
Dividing both sides by (se_ri) to address heteroskedasticity, we obtain model (7), which must be estimated by suppressing the constant term.
We estimate models (5) and (7) for 3 country types: LICs, mixed-income countries (Mixed) and all countries (All). The PET-FAT-PEESE estimations allow for making inference about the existence or absence of genuine effect for the typical study, but they assume that the moderating variables that may be structurally related to study characteristics are equal to their sample means and independent of the standard error. Therefore, we also carry out a multivariate MRA to establish the extent to which differences in the moderating variables account for variations in reported effect-size estimates and to verify if the association between corruption and per-capita growth remains robust to inclusion of moderating variables into the model. The MRA specification follows Stanley (2008), Doucouliagos and Ulubasoglu (2008), and Efendic et al. (2011); and can be stated as follows:
As before, (1/se_ri) is precision, Z_ji is a vector of binary variables that may account for variation in the evidence base, and ε_i is the disturbance term because of sampling error.
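A minimal sketch of the two point estimators follows, assuming the usual WLS forms from the cited literature: for PET-FAT, the t-value is regressed on a constant (α_0) and on precision 1/se (β_0); for PEESE, the t-value is regressed on precision (β_0) and on se (α_0) with no constant. The helper names are hypothetical, the data are synthetic and noiseless, and no standard errors are computed, so the sketch covers estimation only and not the FAT or PET hypothesis tests themselves.

```python
def ols_two_regressors(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def pet_fat(r, se):
    """Return (alpha0, beta0): bias term (FAT) and effect beyond bias (PET)."""
    t = [ri / si for ri, si in zip(r, se)]
    precision = [1.0 / si for si in se]
    ones = [1.0] * len(r)
    alpha0, beta0 = ols_two_regressors(ones, precision, t)
    return alpha0, beta0


def peese(r, se):
    """Return (alpha0, beta0) from the no-constant PEESE regression."""
    t = [ri / si for ri, si in zip(r, se)]
    precision = [1.0 / si for si in se]
    beta0, alpha0 = ols_two_regressors(precision, se, t)
    return alpha0, beta0


# Synthetic PCCs built from known parameters so the fit can be checked by eye.
se = [0.05, 0.08, 0.10, 0.15, 0.20]
r_linear = [-0.05 - 1.2 * s for s in se]         # effect -0.05, bias -1.2
r_quadratic = [-0.07 + 0.5 * s * s for s in se]  # effect -0.07, bias 0.5
print(pet_fat(r_linear, se))
print(peese(r_quadratic, se))
```

In applied work the same regressions would be run with heteroskedasticity-robust or cluster-robust standard errors, which a statistics package supplies.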
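The displayed models (5) to (8) do not appear in the text as reproduced here. Assuming the standard PET-FAT, PEESE and MRA specifications in the literature cited above (Stanley, 2008; Stanley and Doucouliagos, 2012), and using the symbols defined in the text, they would take the following form. The error-term symbols u_i and v_i and the coefficient label β_j are notational assumptions.

```latex
\begin{align}
t_i &= \alpha_0 + \beta_0 \left( \frac{1}{se_{r_i}} \right) + u_i \tag{5} \\
r_i &= \beta_0 + \alpha_0 \, se_{r_i}^{2} + u_i \tag{6} \\
t_i &= \beta_0 \left( \frac{1}{se_{r_i}} \right) + \alpha_0 \, se_{r_i} + v_i \tag{7} \\
t_i &= \alpha_0 + \beta_0 \left( \frac{1}{se_{r_i}} \right)
       + \sum_{j} \beta_j \left( \frac{Z_{ji}}{se_{r_i}} \right) + \varepsilon_i \tag{8}
\end{align}
```

Model (7) is model (6) divided by se_ri, using the identity t_i = r_i / se_ri, which is why it contains no constant term.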
We estimate models (5), (7) and (8) with heteroskedasticity-robust standard errors, one-way cluster-robust standard errors and two-way cluster-robust standard errors. The first is robust to heteroskedasticity so long as the observations are independent. One-way cluster-robust estimation corrects the downward bias in the standard errors (Wooldridge, 2002) and relaxes the assumption of independence between effect-size estimates reported by a given study. Finally, two-way cluster-robust estimation provides standard errors corrected for dependence within each study and within the group of studies that use the same corruption data source. The method is reported to yield less biased standard errors when the number of f-clusters (in our case the number of countries) and t-clusters (in our case, the number of corruption data sources) are sufficiently large. In our case, the number of f-clusters is large but the number of t-clusters is small. In such cases, the method is reported to yield standard errors that are comparable to or more reliable than those produced by one-way clustering based on the large number of clusters (Thompson, 2011; Petersen, 2007; Cameron et al., 2008; and Gow et al., 2010). 6 Hence, we use the two-way cluster-robust estimation both as an additional check for robustness and as our preferred estimation for inference.
Table 1 presents weighted means of the PCCs for each study and other information that depicts some dimensions of the research field. A simple vote count reveals that only two primary studies (6.9% of the total) and 44 estimates (13.5% of the total) yield positive weighted means. Of these, only one study with five estimates (Rahman et al., 2000) yields a positive and statistically significant weighted mean. The number of studies yielding negative and statistically-significant weighted means is 19 (65.5% of the total), providing 280 estimates (85.6% of the total).
A large majority of the negative and significant weighted means (14 out of 19) are greater than 0.1 in absolute value. Hence, we can conclude that PCCs calculated for each primary study tend to indicate a negative association between corruption and per-capita GDP growth.
The second observation concerns the potential sources of heterogeneity in the evidence base. For example, the number of estimates reported by each study differs significantly; and the estimates are based on four different corruption indices, with ICRG and TI corruption data used more heavily compared to WGI and Other corruption data sources. Another source of heterogeneity is the different lengths of the time periods over which the data are averaged to estimate the growth models. The length of the period for averaging indicates whether the effect-size estimates relate to the long-run (more than 5 years), medium-term (between 3 and 5 years) or short-term (1-2 years) effects.

Fixed-Effect Weighted Means
Although the weighted means of the PCCs tend to be negative, they cannot be taken as reliable evidence of negative effect as the underlying estimates may be contaminated with selection bias. First, we present funnel plots to inspect bias visually. Funnel plots indicate the extent to which statistically-significant results are treated more favourably by editors and referees, and by authors pre-disposed to justify model selection (Stanley, 2008). In the absence of selection bias, a funnel graph should be symmetric and reflect more (less) variation in the PCCs with low (high) precision (Stanley, 2008). Figure 1 above presents two funnel plots with pools of PCCs based on all-country and low-income-country samples. The funnel plot for all-country PCCs reflects less asymmetry compared to the funnel plot for low-income-country PCCs. Although funnel plots are useful devices for visual inspection of publication selection bias, the magnitude of the bias and its significance need to be verified. Also, we need to establish whether a genuine effect exists beyond bias.

Meta-Analysis Results
For estimation, we pool the evidence on the basis of two country types included in primary study samples (LICs only and mixed countries) and a third pool consisting of the full set of evidence. Panel A of Table 2 presents the PET-FAT results, based on WLS estimation with heteroskedasticity-robust and cluster-robust standard errors. Panel B presents the results of the PEESE estimation. The PET/FAT results indicate that the precision's coefficient (β_0) is negative and significant for all country types. However, there is evidence of bias in favour of publishing studies that tend to report negative effect-size estimates. The bias is substantial as the constant term is greater than one in magnitude. These results are robust to controlling for within-study dependence. Given that genuine effect exists after controlling for selection bias, we report the PEESE results in Panel B, which take account of the nonlinear relationship between the PCCs and their standard errors (Stanley and Doucouliagos, 2007; 2012, chapter 4). The latter are in line with the findings in Panel A, but they also indicate a stronger effect (β_0), which is now about −0.07 for mixed-country samples and the full sample and −0.26 for LICs. Doucouliagos and Ulubasoglu (2008: 65), drawing on Cohen's (1988) guidelines, indicate that the PCC represents small effect if its absolute value is less than 0.10, medium effect if it is 0.25 and over, and large if it is greater than 0.4. Using this criterion, we conclude that corruption has a weak effect on per-capita GDP growth in mixed-country samples and the full set; and moderate effect in LICs. However, it must be noted that the moderate effect in LICs is based on a narrow evidence base consisting of 16 estimates only.
The PET-FAT results measure the overall effects for the typical study, but they are based on the assumption that the moderating variables are equal to their sample means and independent of the standard error. As such, they overlook the potential sources of heterogeneity and attribute the bias only to publication selection. Therefore, we estimate a multivariate meta-regression that includes the moderating variables given in Table 3. These are dummy variables that are equal to 1 if the estimate reported in the primary study is dependent on the characteristic captured by the variable and zero otherwise.
Our choice of moderating variables is informed by the theoretical and empirical dimensions of the research field and other factors likely to affect the estimates in primary studies. For example, endogeneity of the corruption scores and reverse causality between the latter and growth are significant issues that may confound the reported estimates (see Kurtz and Schrank, 2007). Furthermore, Kaufmann et al. (2007) demonstrate that reverse causality may depend on the time horizon over which growth effects of corruption are calculated.
In the presence of endogeneity, OLS estimates are biased and inconsistent, and hypothesis tests can lead to misleading inference. Therefore, studies on the relationship between institutional quality and economic performance control for endogeneity using exogenous instruments that are correlated with measures of institutional quality but not with the error term of equation (2). 7 Following this tradition, most of the primary studies reviewed here address endogeneity by conducting instrumented estimations instead of or in addition to noninstrumented methods. For example, Easterly et al. (2006) and  use the ethnic fractionalisation index as an instrument for corruption. Other studies (for example Li et al., 2000; Ahlin and Pang, 2008; Pellegrini and Gerlagh, 2004; Attila, 2008; Haque and Kneller, 2008) carry out two-stage or three-stage least-squares (2SLS or 3SLS) estimations, using different instruments. A third set of studies utilise a generalized method of moments (GMM) estimator proposed by Arellano and Bond (1991). Studies using GMM include Gyimah-Brempong (2002) and Aixala and Fabro (2008). Therefore, in the MRA, we control for instrumental variable estimation methods to establish if the latter yield systematically different effect-size estimates.
A second dimension of the research field relates to the length of the period over which the dependent and independent variables are averaged. Kaufmann et al. (2007) and Gwartney et al. (2004) report that the effect of institutional quality on per-capita GDP growth is felt in the long run, usually after 5-10 years. Therefore, we control for the time horizon to verify if the effect of corruption on per-capita GDP growth is larger when the data are averaged over periods longer than 5 years.
Two further issues have been discussed in the literature: whether corruption has a regime-specific effect on growth; and whether different corruption scores measure the same phenomenon. According to  and Méon and Weill (2010), corruption would have weaker or no effects on growth in countries with poor governance quality overall. Given that LICs tend to cluster at the lower end of the governance quality indices, we use the LIC dummy to test the hypothesis of regime-specific effects.
Regarding corruption data sources, Kaufmann et al. (2009) report high levels of correlation (0.65 or over) between existing governance indicators in general and corruption indicators in particular. Although correlated, corruption indicators also reflect significant differences that may be related to different sampling strategies or intended end-users (business people, advocacy groups, governments, etc.). With respect to the latter, ICRG tends to cater for business decision-makers; TI data aims to inform decisions by anti-corruption advocacy groups and governments; and WGI data is intended to inform decisions by governments and development agencies. Finally, ICRG data is provided at a fee whereas TI and WGI data are available free of charge. Therefore, we test whether such variations have systematic effects on the effect-size estimates reported in primary studies.
Finally, we control for two publication-related factors: publication type and whether studies have been published after 2008 (i.e., in the last 3 years of the period for this study). With respect to the former, we check whether journal papers tend to report systematically different estimates in comparison to other publication types such as book chapters and working papers. Controlling for publication type enables us to verify whether authors and/or journal editors are pre-disposed to publish studies with significant effect estimates to justify model selection (Card and Krueger, 1995; Sterling et al., 1995; Stanley, 2008), or whether editors face what Costa-Font et al. (2013) describe as the 'winner's curse', which allows the better journals to publish more intensely selected studies. We control for recently published studies to verify whether the reported effect-size estimates tend to become smaller as more researchers challenge the status quo with richer datasets. According to Gehr et al. (2006), empirical studies tend to report smaller effect-size estimates over time because of the use of larger samples and the falsification efforts that follow the initial findings.
To capture all these dimensions of the data, we estimate the MRA model (8) with all moderating (Z) variables summarised in Table 3. However, inclusion of a large number of variables may cause over-determination and multicollinearity problems. Therefore, we follow a general-to-specific modelling procedure to reduce model complexity and ensure congruency at the same time. The procedure involves removing the most insignificant variable (i.e. the one with the largest p-value) from the model, one at a time, until all included variables are significant (Charemza and Deadman, 1997; Doucouliagos and Stanley, 2009; for a review of the literature on general-to-specific modelling, see Campos et al., 2005). Table 4 below presents the MRA results from the specific model, with heteroskedasticity-robust, one-way cluster-robust and two-way cluster-robust standard errors. In all estimations, the F-test indicates that the covariates are jointly significant and the moderating variables account for 26.5% of the variation in t-values. We take the two-way cluster-robust estimation as the preferred model because it accounts for two types of dependence: (1) dependence between multiple estimates reported by a given study; and (2) dependence between estimates reported by different studies that use the same corruption data source.
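The elimination loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the estimation code used for Table 4: the moderators `z1`, `z2`, `z3` and the data are hypothetical toy values, conventional OLS standard errors stand in for the cluster-robust ones, and a fixed threshold `t_crit=1.96` replaces an explicit p-value cut-off (with common degrees of freedom, the largest p-value belongs to the smallest absolute t-statistic).

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(k):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[k:] for row in m]


def ols_t_stats(X, y):
    """Return conventional OLS t-statistics, one per column of X."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return [beta[i] / (sigma2 * inv[i][i]) ** 0.5 for i in range(k)]


def general_to_specific(moderators, y, t_crit=1.96):
    """Drop the weakest moderator, one at a time, until all remaining pass t_crit."""
    kept = list(moderators)
    while kept:
        X = [[1.0] + [moderators[name][r] for name in kept] for r in range(len(y))]
        t = ols_t_stats(X, y)[1:]  # skip the intercept, which is always retained
        weakest = min(range(len(kept)), key=lambda i: abs(t[i]))
        if abs(t[weakest]) >= t_crit:
            break  # every remaining moderator is significant
        kept.pop(weakest)
    return kept


# Hypothetical toy data: three 0/1 moderators in a balanced design, repeated 4 times.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 4
moderators = {
    "z1": [float(a) for a, b, c in rows],
    "z2": [float(b) for a, b, c in rows],
    "z3": [float(c) for a, b, c in rows],
}
# Deterministic disturbance, orthogonal to the intercept and to every moderator.
noise = [0.1 * (2 * a - 1) * (2 * b - 1) * (2 * c - 1) for a, b, c in rows]
# z2 is given a negligible effect on purpose, so the procedure should discard it.
y = [1.0 + 2.0 * a + 0.01 * b - 1.5 * c + e for (a, b, c), e in zip(rows, noise)]

selected = general_to_specific(moderators, y)
print(selected)  # ['z1', 'z3']
```

In this toy design the weak moderator `z2` is removed in the first pass and the loop stops once the two strong moderators remain.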
The precision's coefficient is insignificant in all estimations. Hence, the overall effect of corruption on per-capita GDP growth is insignificant when we control for the moderating variables. However, the MRA results indicate that the moderating variables have significant effects on the effect-size estimates reported in primary studies. In other words, corruption's marginal effect is conditional on the moderating variables and can be calculated as the sum of all significant coefficients on the dummy variables, excluding the constant term, which captures the residual selection bias. Hence, when the modelled characteristics hold, the marginal effect of corruption on per-capita GDP growth is −0.193 (= −0.060 − 0.114 − 0.107 + 0.065 + 0.023).
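The roles of the precision coefficient and the constant term can be illustrated with the bivariate case. The sketch below assumes the standard PET/FAT specification, in which each t-value is regressed on precision (the inverse of the standard error); model (8) itself is not reproduced in this section, so the exact form is an assumption. The data are synthetic, the function name `fat_pet` is hypothetical, and conventional OLS standard errors are used rather than the cluster-robust ones reported in the paper.

```python
def fat_pet(t_values, std_errors):
    """OLS of t-values on precision (1/SE): t_i = b0 + b1 * (1/SE_i) + e_i."""
    precision = [1.0 / se for se in std_errors]
    n = len(t_values)
    mean_p = sum(precision) / n
    mean_t = sum(t_values) / n
    sxx = sum((p - mean_p) ** 2 for p in precision)
    sxy = sum((p - mean_p) * (t - mean_t) for p, t in zip(precision, t_values))
    b1 = sxy / sxx              # precision coefficient: genuine effect (PET)
    b0 = mean_t - b1 * mean_p   # constant: publication selection bias (FAT)
    resid = [t - b0 - b1 * p for p, t in zip(precision, t_values)]
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_b1 = (sigma2 / sxx) ** 0.5
    se_b0 = (sigma2 * (1.0 / n + mean_p ** 2 / sxx)) ** 0.5
    return {
        "constant": b0,
        "precision": b1,
        "t_constant": b0 / se_b0,
        "t_precision": b1 / se_b1,
    }


# Synthetic illustration only: a true effect of -0.1 and a selection bias of -1.0.
precisions = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
std_errors = [1.0 / p for p in precisions]
# Deterministic disturbance, orthogonal to the constant and to precision.
noise = [0.2 * s for s in (1, -1, -1, 1, 1, -1, -1, 1)]
t_values = [-1.0 - 0.1 * p + e for p, e in zip(precisions, noise)]

result = fat_pet(t_values, std_errors)
print(round(result["constant"], 3), round(result["precision"], 3))
```

A significant constant points to selection bias, while a significant precision coefficient points to an effect beyond that bias. In the multivariate MRA the same logic applies, with the moderating variables added to the right-hand side.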
In a comprehensive study that aims to identify the explanatory variables that have a significant effect on growth, Sala-i-Martin et al. (2004) use a Bayesian approach to averaging OLS coefficients across models. They report that 18 variables are partially correlated with growth, including primary school enrolment, initial level of income, regional dummies, religious dummies and relative price of investment goods. Corruption is not included in that analysis on the grounds that corruption data does not go back far enough to test for a long-run relationship. Hence, our finding bridges the gap in the evidence on the corruption-growth relationship by demonstrating that higher levels of perceived corruption are associated negatively with per-capita GDP growth, when relevant moderating factors are taken into account.
Results in Table 4 indicate that the effect of corruption on per-capita GDP growth is more adverse when primary studies average the corruption data over periods longer than 5 years, use data for LICs only, and are published as journal papers rather than book chapters or working papers. The stronger negative effect associated with the time period over which corruption data is averaged indicates that corruption has more adverse effects on per-capita GDP growth in the longer term than in the short term. This finding is in line with the literature indicating that it takes 5-10 years for governance quality to affect economic performance (Gwartney et al., 2004; Kaufmann et al., 2007). Thus, our meta-regression estimates a small adverse effect on growth, −0.193, for journal-published studies based on LIC-only data when researchers use ICRG data averaged over periods longer than 5 years and employ 2SLS. For non-LICs, this estimate is notably less adverse, −0.079.
LICs tend to score low on all corruption control indices. They also have lower scores for other institutional quality indicators such as government effectiveness, rule of law and accountability (see Kaufmann et al., 2009). Hence, the stronger adverse effect on per-capita GDP growth in LICs does not support the findings of  and Méon and Weill (2010), who report that corruption has larger effects in countries with good quality institutions and smaller or no effects in countries with weak institutions. However, two caveats are in order. First, the number of observations with LIC-only data is rather small (16 of 327). Second, there is a high degree of correlation between corruption and institutional indicators such as accountability, government efficiency and rule of law (see Kaufmann et al., 2009). Therefore, in the LIC-only samples, corruption may be capturing a wider range of institutional weaknesses that are not controlled for in the primary studies.
Journal papers, as opposed to working papers or book chapters, tend to report stronger negative effects. This finding lends support to Sterling et al. (1995), who posit that journal editors' selection criteria may be operating as a filter for significant results; and/or to the conjecture that authors may be pre-disposed to publish studies with significant effect estimates to justify model selection (Card and Krueger, 1995; Stanley, 2008). However, it may also be because of what Costa-Font et al. (2013) describe as the 'winner's curse', whereby journals with higher levels of perceived quality tend to report more intensely selected and biased findings. Our finding lends some support to the 'winner's curse' argument because the constant term that captures the residual selection bias is smaller in magnitude (−1.063) in the MRA that controls for journal papers compared to the selection bias (−1.302) in PET/FAT estimations that do not. Removing the 'winner's curse' further lessens corruption's effect in LICs to −0.086 (= −0.060 − 0.114 + 0.065 + 0.023), whereas non-LICs would then have a slightly positive effect from corruption (0.028).
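The marginal effects quoted in this discussion can be reproduced by summing the significant dummy coefficients under each scenario. In the short Python check below, the mapping of −0.114 to the LIC dummy and −0.107 to the journal dummy is inferred from the sums given in the text, which leaves −0.060 for the long-run dummy. The text does not say which of the two positive coefficients belongs to ICRG and which to 2SLS, so they carry the hypothetical labels `offset_a` and `offset_b`.

```python
# Significant moderator coefficients, taken from the sums quoted in the text.
COEFFICIENTS = {
    "long_run": -0.060,
    "lic_only": -0.114,
    "journal": -0.107,
    "offset_a": 0.065,  # ICRG or 2SLS; the assignment is not stated in the text
    "offset_b": 0.023,  # ICRG or 2SLS; the assignment is not stated in the text
}


def marginal_effect(excluded=()):
    """Sum all coefficients except those switched off for the scenario."""
    total = sum(v for name, v in COEFFICIENTS.items() if name not in excluded)
    return round(total, 3)


scenarios = {
    "LIC, journal": marginal_effect(),
    "non-LIC, journal": marginal_effect(("lic_only",)),
    "LIC, non-journal": marginal_effect(("journal",)),
    "non-LIC, non-journal": marginal_effect(("lic_only", "journal")),
}

for label, value in scenarios.items():
    print(f"{label}: {value:+.3f}")
```

The four values match the figures in the text: −0.193, −0.079, −0.086 and 0.028.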
Finally, results in Table 4 indicate that primary studies that use ICRG corruption data tend to report weaker adverse effects compared to others that use WGI or TI data. 8 A similar result is obtained for estimates based on 2SLS. The moderating effect of the ICRG data may arise for two reasons. First, unlike WGI data, the ICRG corruption index is based on a single survey conducted by the providers rather than on aggregation of different survey results, as is the case with WGI. Dependence on multiple survey results may reduce the risk of sampling bias in WGI data, but it introduces added uncertainty about the comparability of the scores across countries and over time (Kaufmann et al., 2009). Second, and unlike other corruption data sources, ICRG data is market-tested. In other words, it is financed by users (mainly international investors and business managers) who would be willing to pay a fee only if they perceive the data as sufficiently informative. WGI and TI data are not market-tested as they are financed through public funds or donations. Hence, our finding suggests that there may be significant differences between alternative corruption data sources, and researchers should therefore be encouraged to conduct sensitivity checks to verify whether their findings remain robust across different measures of perceived corruption.

Conclusions
The systematic review of the evidence on corruption and growth reveals that primary studies tend to report negative effects. Meta-analysis provides useful tools for evaluating and synthesizing the effect-size estimates, taking into account heterogeneity and publication selection bias. PET-FAT-PEESE results indicate that corruption has a negative effect on per-capita GDP growth, but the magnitude of the effect (−0.072) is small in the full-country sample. The effect is more adverse (−0.258) when the reported estimates are based on LIC data only. However, the effect in LICs should be qualified by the fact that it is based on a small number of observations (16 of 327). The small number of observations on LICs is because of data constraints faced by primary studies and indicates an evidence gap that restricts the scope for evidence-based decision-making and practice.
Results from MRA are conditional on potential sources of heterogeneity in the evidence base. Therefore, they provide more nuanced evidence on the relationship between corruption and per-capita GDP growth. They indicate that corruption's adverse effect is more pronounced when the underlying primary-study estimates relate to longer- rather than shorter-term effects, are based on LIC data only, and are reported in journal papers rather than book chapters or working papers. On the other hand, corruption's adverse effect is reduced when the underlying primary-study estimates are based on ICRG corruption data and on 2SLS, which takes account of endogeneity in growth regressions. When the effect of these moderating variables is taken into account, corruption is harmful to per-capita GDP growth. However, when the research field is not characterised by these factors, corruption does not have a significant effect on per-capita GDP growth.
Our findings indicate that meta-analysis is an effective tool for synthesizing evidence when the evidence base is too diverse to allow for general conclusions. This study has enabled us to derive verifiable conclusions about the marginal effects of corruption on per-capita GDP growth, while accounting for about 26% of the heterogeneity in the evidence base. The synthesized evidence indicates that interventions aimed at reducing corruption in LICs may be justified given the relatively more adverse growth effects reported for these countries. However, it also indicates that more investment in data compilation and research is required to ensure that policy and practice against corruption in these countries are supported by robust evidence.

Weightman for their comments on the protocol; Toke Aidt and Ian Shemilt for their comments on the review report; and the referees whose comments and suggestions have made a significant contribution to the quality of the outcome. Finally, I would like to thank Nawar Hashem and Sefa Awaworyi for their assistance during the course of the project. I remain responsible for any errors or omissions.

Notes
1. The systematic review was conducted between May and December 2010. The systematic review protocol documenting the search strategy, the study inclusion/exclusion criteria and data extraction procedure can be accessed at http://www.dfid.gov.uk/r4d/PDF/Outputs/SystematicReviews/Corruption_impact_2011_Ugur_report.pdf
2. See International Country Risk Guide (ICRG) at http://www.prsgroup.com/ICRG.aspx; Transparency International (TI) at http://www.transparency.org/policy_research/surveys_indices/cpi; and Worldwide Governance Indicators (WGI) at http://info.worldbank.org/governance/wgi/sc_country.asp
3. The search terms are specified in the review protocol referenced in note 1 above.

4. The PIOS framework is informed by the PICOS (Population-Intervention-Comparator-Outcome-Study Design) framework used in Cochrane intervention reviews (see CRD, 2009: 7-10). The difference between the two is the absence of a 'comparator' (i.e. control group) in our review, which synthesizes effect-size estimates derived from observational data rather than randomised controlled trials (RCTs).
5. Other corruption data sources include: Business Environment Risk Intelligence at http://www.beri.com/; Dreher et al. (2007); and the Sachs and Warner (1997) index at http://jae.oxfordjournals.org/content/6/3/335.full.pdf+html
6. For two-way cluster-robust estimation, we used the Stata procedure produced by Mitchell Petersen, which can be accessed at http://www.kellogg.northwestern.edu/faculty/petersen/htm/
7. In this tradition, Acemoglu et al. (2001) use colonial mortality rates as an instrument whilst Knack and Keefer (1997) use ethnic fractionalisation indices. On the other hand, Rodrik et al. (2004) use two-stage least squares to estimate the determinants of institutions first and the effect of the latter on growth thereafter.
8. This result is confirmed in PET/FAT estimations we have conducted by pooling the evidence along corruption data sources. In those estimations, which are not reported here but can be provided on request, the precision's coefficient is −0.021 for ICRG data compared to −0.047 for TI data and −0.53 for WGI data.