CORRUPTION'S DIRECT EFFECTS ON PER-CAPITA INCOME GROWTH: A META-ANALYSIS

Abstract

Corruption is a symptom of weak institutional quality and may have adverse effects on economic growth. However, heterogeneity in reported findings makes it difficult to synthesize the evidence base with a view to testing competing hypotheses and/or supporting evidence-based policy and practice. To address this issue, we have extracted 327 estimates of corruption's direct effect on per-capita GDP growth from 29 primary studies, following a peer-reviewed and pre-published systematic review protocol. Precision-effect and funnel asymmetry tests indicate that corruption has a negative effect on per-capita GDP growth after controlling for publication selection bias and within-study dependence. However, multivariate meta-regression analysis results indicate that the overall effect is not robust to inclusion of moderating variables through a general-to-specific procedure for model specification. We report that the marginal effect of corruption on per-capita GDP growth is more adverse when the primary study estimates relate to long-run growth, are based on low-income-country data only, and are extracted from journal papers. The effect is less adverse in studies that use the International Country Risk Guide corruption perceptions index and in those reporting estimates from two-stage least-squares estimations.

1. Introduction

Corruption is an ancient problem, with which social scientists and policymakers have grappled for centuries (Bardhan, 1997). Nevertheless, our search results indicate that the increase in the number of studies on the relationship between corruption and growth is a recent phenomenon, with average hits increasing from about 10 per year in 1990 to 25 in the mid-1990s and to more than 100 in 2009. The increased research effort coincides with heightened policy concerns that liberalization reforms in the post-cold-war period were not delivering the expected growth benefits because of institutional weaknesses in general and the prevalence of corruption in particular. Hence, it appears that the earlier emphasis on corruption as a by-product of regulation and government intervention (Huntington, 1968) has been overtaken by a new debate on how corruption may coexist with market reforms, posing a new challenge for policy and practice.

The empirical work on corruption and growth combines the institutional approach to economic performance (North, 1990, 1994) with the empirics of growth literature (Barro, 1991; Levine and Renelt, 1991; Mankiw et al., 1992). The findings are heterogeneous not only because of different measures of corruption and growth, but also different estimation methods, country coverage, and sample periods. Hence, policymakers and researchers alike find it difficult to derive verifiable conclusions about corruption's effect on growth. This paper aims to bridge the evidence gap by conducting a systematic review1 of the literature and providing synthesized evidence based on meta-analysis methods.

The paper is organised in five sections. Section 'Corruption and Growth: A Brief Overview' provides a brief overview of the empirical growth model that informs the estimation strategies of the included primary studies. Section 'Systematic Review Methodology' summarizes the systematic review methodology that informs our search strategy, inclusion and exclusion criteria, and data extraction/coding procedures. Section 'Meta-analysis Tools and Methods' introduces the meta-analysis tools/methods and the findings. Finally, Section 'Conclusions' provides a summary of the findings and derives some policy and research conclusions.

2. Corruption and Growth: A Brief Overview

Studies reviewed here use cross-section or panel data to estimate empirical growth models that follow Barro (1991) and Mankiw et al. (1992). In these models, growth depends on the initial level of per-capita income, technical efficiency and investment in physical and human capital. Another feature of the empirical growth models is that technical efficiency grows at the same rate (g) across countries, although its initial level varies between countries because of country-specific factors such as geography. Later empirical work introduces institutional quality, openness to trade or financial development as additional factors that influence technological efficiency. Hence, model building in the empirical growth literature consists of two components: a formal component that draws upon a Cobb-Douglas production function augmented with human capital; and an informal component driven mainly by empirical innovations (Temple, 1999). The primary studies included in this review innovate within a Cobb-Douglas production function with human capital and constant returns to scale, which is given in (1) below.

display math(1)

Here, math formula is the growth rate of per-capita GDP, sk and sh are rates of investment in physical and human capital, n is the rate of population growth, g is the growth rate of technical efficiency, δ is the rate of capital depreciation, math formula are coefficients to be estimated, vi is the country fixed effect and uit is the error term. This model has been estimated by various contributors, including Levine and Renelt (1991), Mankiw et al. (1992), Sachs and Warner (1997), and Gyimah-Brempong and Traynor (1999). Researchers who estimate the direct effect of corruption on growth innovate within this model by adding two types of variables: (1) a perceptions-based measure of corruption; and (2) other variables reported to be relevant in previous empirical work, such as financial development, openness to trade, foreign direct investment, etc. Hence, a generic form of the model used in primary studies can be stated as follows:

display math(2)

Here, Corr is the corruption variable and its coefficient (θ) is the effect-size estimate reported in primary studies. CVf is the vector of the variables included in the formal model whilst CVk is the vector of control variables that reflect innovations related to trade openness, financial development, etc. The meta-analysis in this paper is based on coefficient estimates that measure corruption's direct effects on per-capita GDP growth – namely (θ). We focus on direct effects only because of the small number of studies that estimate indirect effects (three in total) and the difficulty in pooling together indirect effects that unfold via three different channels: human capital, public finance and investment.

Among meta-analysis studies on growth, Abreu et al. (2005) try to establish the magnitude of the convergence rate. Doucouliagos (2005) and Efendic et al. (2011), respectively, meta-analyse the effect of economic freedom and institutional quality indices on growth. These studies report that the rate of convergence and the effect size are larger in magnitude when the bias resulting from unobserved heterogeneity is controlled for. Therefore, we control for heterogeneity through multivariate meta-regression analysis (MRA). We derive the MRA model through a general-to-specific procedure, which involves exclusion of insignificant covariates with the largest p values one at a time until all remaining covariates are significant (Charemza and Deadman, 1997; Doucouliagos and Stanley, 2009). This procedure is preferred because of its tractability and its effectiveness in addressing the issue of overdetermination in the MRA model to be estimated.
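The elimination loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the estimates in this paper: the function name `general_to_specific`, the callback `fit_pvalues` (assumed to re-estimate the MRA on the covariates it receives and to return one p value per covariate) and the 10% significance threshold are all hypothetical.

```python
from typing import Callable, Dict, List


def general_to_specific(
    covariates: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.10,  # assumed significance threshold
) -> List[str]:
    """Drop the least significant covariate one at a time.

    fit_pvalues is a hypothetical callback: given covariate names, it
    re-estimates the model and returns a p value for each name.
    """
    kept = list(covariates)
    while kept:
        pvals = fit_pvalues(kept)  # re-estimate on the current set
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break  # every remaining covariate is significant
        kept.remove(worst)  # exclude the covariate with the largest p value
    return kept
```

Re-estimating after each exclusion matters because the p values of the remaining covariates generally change once a covariate is dropped.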

Huntington (1968) argued that corruption may ‘grease the wheels’ by enabling economic agents to engage in beneficial activities that may otherwise be unfeasible because of high levels of bureaucratic hold-ups in highly regulated countries. However, Myrdal (1968) demonstrated that the hold-ups that private agents try to circumvent through corruption and the level of corruption itself should be considered as symptoms of underlying institutional weaknesses that enable the political elite to maximise bribe revenue by increasing the level of administrative hold-ups.

North's (1990, 1994) seminal contributions on institutions and economic performance appear to have vindicated Myrdal's critique and have informed a wide range of studies on the relationship between corruption and economic performance. Initially, the new work focused on microeconomic efficiency, investigating the effects of corruption on entrepreneurial skills and technology choice. This work became prominent in the 1980s and early 1990s and is reviewed in Svensson (2005). The focus shifted to the macrolevel with Mauro (1995). Although Mauro (1995) did not find a significant relationship between corruption and growth, Mauro (1997) used a larger dataset and concluded that the effect of corruption on investment and per-capita income growth was negative and statistically significant. A one-standard-deviation improvement in the control of corruption is found to be associated with a 4 percentage point increase in the investment rate and a 0.5 percentage point increase in per-capita income growth per annum. Mo (2001) and Méon and Sekkat (2005) lend support to Mauro's findings, which indicate that corruption is detrimental to economic growth.

Bardhan (1997) is the first review of the work on both micro- and macrolevel effects of corruption and offers two conclusions. First, the static effects of corruption on efficiency can be positive or negative, depending on the severity of the policy-induced distortions and the organisation of corruption (i.e. whether corruption is centralised or decentralised). Secondly, the effect of corruption on growth depends on how it affects the composition of investment, the quality of public finance and the incentives to invest in physical and human capital. Although Bardhan (1997) suggests that corruption's effect on growth is likely to be negative, he also indicates that this conclusion is based on historical experience rather than contemporary empirical research, which was in its infancy at the time. Two years later, Wei (1999) reported an increase in the number of empirical studies and concluded that corruption's adverse effect on growth results from reduced domestic investment, discouraged foreign direct investment, overspending in government and distorted composition of government spending. More recently, Aidt et al. (2008) and Méon and Weill (2010) have reported more nuanced findings. They indicate that corruption has regime-specific effects on growth, with weaker effects in countries with weak institutions.

To sum up, the existing literature tends to indicate that the effect of corruption on growth is likely to be negative, but would vary between countries and types of corruption. This variation is highly likely to be augmented by other moderating factors that characterise the research field, including differences in estimation methods, data structure, model specifications and corruption measures used. In this context, meta-analysis offers an objective and verifiable means of synthesizing the evidence base. To ensure the objectivity and replicability of the analysis in this paper, we follow a peer-reviewed and pre-published systematic review protocol that specifies the search, inclusion/exclusion and data extraction criteria.

3. Systematic Review Methodology

A systematic review of the corruption-growth relationship poses specific challenges because corruption is an illicit practice and as such it is difficult to measure. Also, corruption may involve different practices in different countries. To address these challenges, we adopt a principal-agent definition of corruption as suggested by Groenendijk (1997). In this definition, the agent (usually an appointed or elected public official) abuses his/her public authority and imposes a surcharge on the delivery of a service to a principal, who may be a natural or legal entity unable to hold the agent to account because of high monitoring costs.

The principal-agent definition is inclusive enough to accommodate different types of corruption, but it is by no means a universally agreed one (see Philp, 2006; Banerjee et al., 2012). We adopt the principal-agent definition because it captures bureaucratic or political corruption, bribery, nepotism, fraud and embezzlement. It also includes practices that distort the function of the public office, whether these practices are formally illegal or not. A blanket exclusion of acts that are not banned legally, as proposed by Philp (2006) and Banerjee et al. (2012), may entail underestimating the level of corruption and overlooking the endogenous nature of the legal provisions that differentiate between corrupt and noncorrupt behaviour.

Indeed the corruption data sources used in the primary studies do not distinguish between legal and illegal corruption; and try to capture corrupt practices that fit well with the principal-agent definition. For example, the International Country Risk Guide (ICRG) index is constructed to capture the perceptions of respondents with respect to ‘actual or potential corruption in the form of excessive patronage, nepotism, job reservations, ‘favour-for-favours’, secret party funding, and suspiciously close ties between politics and business’. The Transparency International (TI) index, on the other hand, aims to capture ‘information about the administrative and political aspects of corruption’, through questions related to ‘bribery of public officials, kickbacks in public procurement, embezzlement of public funds’ and questions that ‘probe the strength and effectiveness of public sector anti-corruption efforts’. Finally, the World Governance Indicators (WGI) scores aim to capture ‘perceptions of the extent to which public power is exercised for private gain, including both petty and grand forms of corruption, as well as ‘capture’ of the state by elites and private interests.’2

Our systematic review methodology draws on guidelines proposed by the Centre for Reviews and Dissemination of the University of York (CRD, 2009), which reflects best practice in systematic reviews registered with Campbell and Cochrane Collaborations. We searched in 20 electronic databases for journal papers, working papers, reports and PhD theses; using 32 keywords for corruption, growth, developing countries, and 43 low-income-country (LIC) names.3 The search produced 1,042 hits, the title/abstract screening of which was carried out on the basis of PIOS (Population, Independent variable, Outcome, Study design) criteria.4 The title/abstract screening led to selection of 338 studies for critical evaluation. In this second stage, we evaluated the studies on the basis of full-text information using validity, reliability and applicability (VRA) criteria. The critical evaluation led to inclusion of 40 empirical studies, of which 32 studies reported effects on per-capita GDP growth. The remaining 8 studies reported effects on GDP or per-capita GDP levels. We extracted data from all included studies (40 studies), but the data on the relationship between corruption and GDP and per-capita GDP levels is not used in this meta-analysis for two reasons. First, we wanted to focus only on corruption's effect on per-capita GDP growth, which is the outcome variable in model (2) above. Secondly, the evidence base on the relationship between corruption and income levels is small.

Of the included studies, only three estimate the indirect effects of corruption on per-capita GDP growth – i.e., the effects of corruption on growth when corruption is interacted with investment, public finance and human capital. We have excluded these estimates from the meta-analysis too because they refer to three different transmission channels that cannot be aggregated. Hence, our focus is only on the direct effects of corruption on per-capita GDP growth reported in 29 primary studies.

Primary studies draw upon four main sources of corruption data, which consist of scores based on perceived levels of corruption. The scores have different scales, which range from 0 to 6 for ICRG data; from −2.5 to +2.5 for WGI data; from 0 to 12 for TI data; and different ranges in Other corruption data sources.5 With the exception of TI data, an increase in the score refers to less corruption. To ensure consistency, most of the original studies transform the corruption indices such that the scale is kept the same but an increase in the score is made to reflect increased corruption. In the minority of cases where such a transformation is not carried out, we have coded the reported estimates as ‘not transformed’ and multiplied the reported estimates by (−1) to ensure sign compatibility.
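The sign adjustment described above amounts to a one-line rule. The sketch below is illustrative only; the function name `harmonise_sign` and the boolean flag `transformed` are hypothetical labels for the coding described in the text, and the same flip would apply to the associated t-statistic.

```python
def harmonise_sign(estimate: float, transformed: bool) -> float:
    """Return the estimate signed so that a higher score means more corruption.

    transformed is True when the primary study has already rescaled the
    index so that an increase reflects increased corruption; otherwise
    the reported estimate is multiplied by -1.
    """
    return estimate if transformed else -estimate
```

For example, a reported coefficient of 0.3 on an untransformed index (where a higher score means less corruption) would enter the meta-analysis as −0.3.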

For meta-analysis, we have extracted all direct-effect estimates reported in included studies. The alternative would have been to choose a summary statistic (for example the average or median) for each study or a single estimate chosen on the basis of significance or sample size. Such alternatives, however, would have two major shortcomings. First, they would prevent the use of all available information. Secondly, the selection criterion is likely to have a subjective dimension that could bias the results or at least reduce the scope for comparability or replicability of the findings in different meta-analysis studies (de Dominicis et al., 2008; Stanley, 2008; and Stanley and Doucouliagos, 2009).

4. Meta-analysis Tools and Methods

We first calculate partial correlation coefficients (PCCs), which measure the association between corruption and per-capita GDP growth while other explanatory variables are held constant. PCCs are comparable across studies as they are independent of the metrics with which the independent and dependent variables are measured. An alternative measure would have been elasticities, which are also comparable across studies and measure the effect size in an economically more meaningful manner. However, primary studies do not provide the information necessary to calculate elasticities. For this reason, partial correlations are used extensively in meta-analysis (see Doucouliagos and Ulubasoglu, 2008; Doucouliagos and Laroche, 2009).

Secondly, we calculate fixed-effect weighted means of the PCCs to provide a descriptive summary of the evidence reported by each study. Thirdly, we provide funnel plots for visual inspection of publication selection bias. In the fourth stage, we conduct precision-effect and funnel-asymmetry tests (PETs and FATs) to quantify the publication selection bias and to verify whether a genuine effect exists after controlling for publication selection. In the fifth stage, we conduct multivariate MRA by modelling the sources of variation explicitly. By combining results from a number of studies and regressing the estimates on moderating factors that characterise the research field, the MRA provides more reliable effect estimates than individual studies or groups of studies (Stanley and Jarrell, 1989; Kulinskaya et al., 2008).

4.1 Estimators and Models

The PCC (r_i) for each effect-size estimate and its standard error (se_{ri}) are calculated in accordance with (3) and (4) below.

$$ r_i = \frac{t_i}{\sqrt{t_i^{2} + df_i}} \qquad (3) $$

and

$$ se_{ri} = \sqrt{\frac{1 - r_i^{2}}{df_i}} \qquad (4) $$

Here, t_i and df_i are the t-statistic and degrees of freedom associated with each effect-size estimate reported in primary studies. The standard error (se_{ri}) represents the variation due to sampling error, and the inverse of its square is used as the weight to calculate fixed-effect estimators (FEEs) for study-based weighted means. The latter take account of within-study variation by assigning lower weights to less precise estimates. Weighted means are more reliable than simple means, but they cannot be considered as genuine measures of effect size if primary-study estimates are subject to selection bias and/or affected by within-study dependence between reported estimates (de Dominicis et al., 2008).
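The calculation of the PCC, its standard error and the fixed-effect weighted mean can be sketched as follows. The sketch assumes the formulas commonly used in the meta-regression literature, r = t / sqrt(t² + df) and se = sqrt((1 − r²) / df), with inverse-variance weights; the function names `pcc`, `pcc_se` and `fixed_effect_mean` are hypothetical and are not taken from the primary studies.

```python
import math
from typing import List, Tuple


def pcc(t_stat: float, df: int) -> float:
    """Partial correlation coefficient from a t-statistic and its degrees of freedom."""
    return t_stat / math.sqrt(t_stat ** 2 + df)


def pcc_se(r: float, df: int) -> float:
    """Standard error of the partial correlation coefficient."""
    return math.sqrt((1.0 - r ** 2) / df)


def fixed_effect_mean(estimates: List[Tuple[float, int]]) -> float:
    """Inverse-variance weighted mean of the PCCs for a list of (t, df) pairs."""
    numerator = 0.0
    denominator = 0.0
    for t_stat, df in estimates:
        r = pcc(t_stat, df)
        weight = 1.0 / pcc_se(r, df) ** 2  # less precise estimates get lower weight
        numerator += weight * r
        denominator += weight
    return numerator / denominator
```

Under these formulas the ratio of a PCC to its standard error equals the original t-statistic, which is why the t values reported in primary studies can serve as the dependent variable in the tests that follow.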

In the next step, we conduct PET-FAT analysis to establish whether the effect-size estimates are subject to publication selection bias and whether they represent genuine effects beyond bias. The PET involves estimating a weighted least squares (WLS) bivariate model used widely in the literature (see, for example, Abreu et al., 2005; Stanley, 2005; Stanley and Doucouliagos, 2007; Stanley, 2008; and Efendic et al., 2011). Stanley (2008) demonstrates that model (5) below can be used to test for both funnel asymmetry (publication selection bias) and for genuine effect beyond selection bias.

$$ t_i = \alpha_0 + \beta_0 \left(\frac{1}{se_{ri}}\right) + \varepsilon_i \qquad (5) $$

Here t_i and se_{ri}, respectively, are the test statistic reported by primary studies and the standard error of the PCC as specified in (4) above. The FAT involves testing for α0 = 0, whilst the PET tests for β0 = 0. The FAT is known to have low power – i.e., low probability of rejecting the null hypothesis when the latter is actually false. Against this weakness, model (5) has the advantage of testing for genuine effect when publication selection bias is controlled for. Doucouliagos and Stanley (2009, 2013) indicate that the selection bias should be considered as substantial if |α0| ≥ 1 and as severe if |α0| ≥ 2. Furthermore, Stanley and Doucouliagos (2007; 2012, chapter 4) indicate that the reported effect size and its standard error have a nonlinear relationship if the PET indicates the existence of genuine effect. In such cases, they propose a precision-effect estimation with standard errors (PEESE) to obtain a corrected estimate of β0. The PEESE model can be stated as follows:

$$ r_i = \beta_0 + \alpha_0\, se_{ri}^{2} + u_i \qquad (6) $$

Dividing both sides by (se_{ri}) to address heteroskedasticity, we obtain model (7), which must be estimated by suppressing the constant term.

$$ t_i = \alpha_0\, se_{ri} + \beta_0 \left(\frac{1}{se_{ri}}\right) + \varepsilon_i \qquad (7) $$
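The point estimates of the two bivariate models can be sketched in a few lines. The sketch assumes that the PET-FAT model regresses the t value on a constant (α0) and precision (β0), and that the PEESE model regresses the t value on the standard error (α0) and precision (β0) without a constant, as described in the text; the function names `pet_fat` and `peese` are hypothetical, and the robust and cluster-robust standard errors used for inference are not computed here.

```python
from typing import List, Tuple


def pet_fat(t_stats: List[float], ses: List[float]) -> Tuple[float, float]:
    """Least-squares fit of t_i on a constant and precision (1/se_i).

    Returns (alpha0, beta0): the bias term and the precision coefficient.
    """
    x = [1.0 / s for s in ses]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(t_stats) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, t_stats))
    beta0 = sxy / sxx
    alpha0 = mean_y - beta0 * mean_x
    return alpha0, beta0


def peese(t_stats: List[float], ses: List[float]) -> Tuple[float, float]:
    """Least-squares fit of t_i on se_i and 1/se_i with the constant suppressed.

    Returns (alpha0, beta0): the coefficients on se_i and on precision.
    """
    a = list(ses)
    b = [1.0 / s for s in ses]
    saa = sum(ai * ai for ai in a)
    sbb = sum(bi * bi for bi in b)
    sab = sum(ai * bi for ai, bi in zip(a, b))
    say = sum(ai * yi for ai, yi in zip(a, t_stats))
    sby = sum(bi * yi for bi, yi in zip(b, t_stats))
    det = saa * sbb - sab * sab  # determinant of the 2x2 normal equations
    alpha0 = (say * sbb - sab * sby) / det
    beta0 = (saa * sby - sab * say) / det
    return alpha0, beta0
```

Because both models are already expressed in t values (the PCC divided by its standard error), ordinary least squares on these transformed variables is equivalent to weighted least squares on the untransformed relationship.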

We estimate models (5) and (7) for three country types: LICs, mixed-income countries (Mixed) and all countries (All). The PET-FAT-PEESE estimations allow for making inference about the existence or absence of genuine effect for the typical study, but they assume that the moderating variables that may be structurally related to study characteristics are equal to their sample means and independent of the standard error. Therefore, we also carry out a multivariate MRA to establish the extent to which differences in the moderating variables account for variations in reported effect-size estimates and to verify whether the association between corruption and per-capita growth remains robust to inclusion of moderating variables in the model. The MRA specification follows Stanley (2008), Doucouliagos and Ulubasoglu (2008), and Efendic et al. (2011); and can be stated as follows:

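$$ t_i = \alpha_0 + \beta_0 \left(\frac{1}{se_{ri}}\right) + \sum_{j} \beta_j \left(\frac{Z_{ji}}{se_{ri}}\right) + \varepsilon_i \qquad (8) $$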

As before, (1/se_{ri}) is precision, Z_{ji} is a vector of binary variables that may account for variation in the evidence base, and ε_i is the disturbance term due to sampling error.
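To make the weighting of the moderating variables concrete, the sketch below builds the regressors for a single effect-size estimate. It assumes that the MRA includes a constant, precision and each binary moderator divided by the standard error of the PCC; the function name `mra_row` and the moderator labels used in the example are hypothetical.

```python
from typing import Dict, List


def mra_row(se_r: float, dummies: Dict[str, int], names: List[str]) -> List[float]:
    """Regressors for one estimate: constant, precision, then each Z_j / se_r.

    dummies maps moderator names to 0/1; names fixes the column order.
    Moderators missing from dummies are treated as 0.
    """
    precision = 1.0 / se_r
    return [1.0, precision] + [dummies.get(name, 0) * precision for name in names]
```

For an estimate with a standard error of 0.25 that uses ICRG data and a mixed-country sample, `mra_row(0.25, {"ICRG": 1}, ["ICRG", "LIC"])` gives a constant of 1, a precision of 4, an ICRG regressor of 4 and an LIC regressor of 0.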

We estimate models (5), (7) and (8) with heteroskedasticity-robust standard errors, one-way cluster-robust standard errors and two-way cluster-robust standard errors. The first is robust to heteroskedasticity so long as the observations are independent. Cluster-robust estimation corrects the downward bias in the standard errors (Wooldridge, 2002) and relaxes the assumption of independence between effect-size estimates reported by a given study. Finally, two-way cluster-robust estimation provides standard errors corrected for dependence within each study and within the group of studies that use the same corruption data source. The method is reported to yield less biased standard errors when the number of f-clusters (in our case the number of countries) and t-clusters (in our case, the number of corruption data sources) are sufficiently large. In our case, the number of f-clusters is large but the number of t-clusters is small. In such cases, the method is reported to yield standard errors that are comparable to or more reliable than those produced by one-way clustering based on the large number of clusters (Thompson, 2011; Petersen, 2007; Cameron et al., 2008; and Gow et al., 2010).6 Hence, we use the two-way cluster-robust estimation both as an additional check for robustness and as our preferred estimation for inference.

4.2 Fixed-Effect Weighted Means

Table 1 presents weighted means of the PCCs for each study and other information that depicts some dimensions of the research field. A simple vote count reveals that only two primary studies (6.8% of the total) and 44 estimates (13.5% of the total) yield positive weighted means. Of these, only one study with five estimates (Rahman et al., 2000) yields a positive and statistically significant weighted mean. The number of studies yielding negative and statistically-significant weighted means is 19 (65.5% of the total), providing 280 estimates (85% of the total). A large majority of the negative and significant weighted means (14 out of 19) are greater than 0.1 in absolute value. Hence, we can conclude that PCCs calculated for each primary study tend to indicate a negative association between corruption and per-capita GDP growth.

Table 1. An Overview of the Evidence Base
Study reporting direct effect | No. of estimates | Corruption data | Fixed-effect weighted mean of PCCs | Significant | Averaging period for growth (years) | Averaging period for corruption (years)
Ahlin and Pang (2008) | 46 | ICRG, TI | −0.1183 | Yes | 41 | 21
Aidt (2009) | 20 | TI | −0.2992 | Yes | 31 | 6
Aidt et al. (2005) | 30 | TI | −0.1295 | Yes | 16 | 7
Aidt et al. (2008) | 31 | WGI, TI | −0.3618 | Yes | 16 | 7
Aixala and Fabro (2008) | 2 | WGI, TI | −0.3037 | Yes | 15 | 15
Blackburn et al. (2008) | 39 | ICRG | 0.0099 | No | 5 | 5
Drury et al. (2006) | 9 | ICRG | −0.0577 | Yes | 1 | 1
Easterly et al. (2006) | 1 | WGI | −0.3584 |  | 20 | 5
Everhart et al. (2009) | 2 | ICRG | −0.0476 | No | 1 | 1
Gupta et al. (1998) | 6 | ICRG, Other | −0.3855 | Yes | 18 | 16
Gupta et al. (2002) | 10 | ICRG | −0.4153 | Yes | 18 | 16
Gyimah-Brempong (2002) | 4 | TI | −0.2908 | Yes | 1 | 1
Gyimah-Brempong et al. (2006) | 16 | TI | −0.1741 | Yes | 1 | 1
Haque and Kneller (2008) | 6 | ICRG | −0.0417 | No | 5 | 5
Kalyuzhnova et al. (2009) | 1 | TI | −0.3752 |  | 18 | 12
Law (2006) | 1 | ICRG | −0.7759 |  | 15 | 15
Lee (2006) | 5 | Other | −0.4620 | Yes | 1 | 1
Li et al. (2000) | 18 | ICRG | −0.0123 | No | 5 | 5
Li et al. (2001) | 12 | Other | −0.1328 | Yes | 5 | 5
Mauro (1996) | 5 | Other | −0.2686 | Yes | 16 | 4
Meon and Sekkat (2005) | 8 | WGI, TI | −0.3476 | Yes | 19 | 1
Mocan (2007) | 14 | Other | −0.1593 | Yes | 20 | 1
Naude (2004) | 6 | WGI | −0.0092 | No | 20 | 1
Pellegrini and Gerlagh (2004) | 7 | TI | −0.3573 | Yes | 22 | 6
Rahman et al. (2000) | 5 | ICRG | 0.2405 | Yes | 8 | 7
Rock and Bonnett (2004) | 10 | WGI | −0.1225 | No | 13 | 3
Shimpalee and Breuer (2006) | 15 | ICRG | −0.1131 | Yes | 1 | 1
Tanzi (1998) | 3 | TI | −0.2014 | Yes | 1 | 1
Total | 327 |  |  |  |  | 

The second observation concerns the potential sources of heterogeneity in the evidence base. For example, the number of estimates reported by each study differs significantly; and the estimates are based on four different corruption indices, with ICRG and TI corruption data used more heavily compared to WGI and Other corruption data sources. Another source of heterogeneity is the different lengths of the time periods over which the data is averaged to estimate the growth models. The length of the period for averaging indicates whether the effect-size estimates relate to the long-run (more than 5 years), medium-term (between 3 and 5 years) or short-term (1–2 years) effects.
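The time-horizon classification described above can be written as a simple rule. The sketch below is illustrative; the function name `horizon` is hypothetical, and the cut-offs follow the description in the text (more than 5 years for the long run, 3 to 5 years for the medium term, 1 to 2 years for the short term).

```python
def horizon(years: int) -> str:
    """Classify an averaging period (in years) by the time horizon it reflects."""
    if years > 5:
        return "long-run"
    if years >= 3:
        return "medium-term"
    return "short-term"
```

Under this rule a 16-year averaging period is classified as long-run, a 5-year period as medium-term and annual data as short-term.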

Although the weighted means of the PCCs tend to be negative, they cannot be taken as reliable evidence of negative effect as the underlying estimates may be contaminated with selection bias. First, we present funnel plots to inspect bias visually. Funnel plots indicate the extent to which statistically-significant results are treated more favourably by editors and referees; and by authors pre-disposed to justify model selection (Stanley, 2008). In the absence of selection bias, a funnel graph should be symmetric and reflect more (less) variation in the PCCs with low (high) precision (Stanley, 2008).

Figure 1 presents two funnel plots with pools of PCCs based on all-country and low-income-country samples. The funnel plot for all-country PCCs reflects less asymmetry compared to the funnel plot for low-income-country PCCs. Although funnel plots are useful devices for visual inspection of publication selection bias, the magnitude of the bias and its significance need to be verified. Also, we need to establish whether a genuine effect exists beyond bias.

Figure 1. Funnel Plots for Direct Effects on Per-Capita GDP Growth (Partial Correlation Coefficients as Measures of Effect Size).

4.3 Meta-Analysis Results

For estimation, we pool the evidence on the basis of two country types included in primary study samples (LICs only and mixed countries) and a third pool consisting of the full set of evidence. Panel A of Table 2 presents the PET-FAT results, based on WLS estimation with heteroskedasticity-robust and cluster-robust standard errors. Panel B presents the results of the PEESE estimation. The PET/FAT results indicate that the precision's coefficient (β0) is negative and significant for all country types. However, there is evidence of bias in favour of publishing studies that tend to report negative effect-size estimates. The bias is substantial as the constant term is greater than one in magnitude. These results are robust to controlling for within-study dependence.

Table 2. PET-FAT and PEESE Results by Country Type

Panel A: PET/FAT results; with t values as dependent variable
 | Robust SEs: Full set | Robust SEs: Mixed | Robust SEs: LICs | Cluster-robust SEs: Full set | Cluster-robust SEs: Mixed | Cluster-robust SEs: LICs
Precision (β0) | −0.019* | −0.023** | −0.181*** | −0.019* | −0.023** | −0.181***
 | (0.010) | (0.010) | (0.030) | (0.011) | (0.009) | (0.040)
Bias (α0) | −1.302*** | −1.207*** | −1.232*** | −1.302*** | −1.207*** | −1.232***
 | (0.143) | (0.150) | (0.325) | (0.108) | (0.144) | (0.019)
N | 327 | 311 | 16 | 327 | 311 | 16
Adjusted R2 | 0.003 | 0.005 | 0.516 | 0.003 | 0.005 | 0.516
Prob. > F | 0.051 | 0.023 | 0.000 | 0.064 | 0.008 | 0.000

Panel B: PEESE results; with t values as dependent variable
 | Robust SEs: Full set | Robust SEs: Mixed | Robust SEs: LICs | Cluster-robust SEs: Full set | Cluster-robust SEs: Mixed | Cluster-robust SEs: LICs
Precision (β0) | −0.072*** | −0.071*** | −0.258*** | −0.072** | −0.071** | −0.258***
 | (0.007) | (0.007) | (0.012) | (0.028) | (0.025) | (0.009)
St. Error (α0) | −5.383*** | −4.975*** | −4.416*** | −5.383** | −4.975** | −4.416***
 | (0.665) | (0.677) | (1.235) | (1.669) | (1.361) | (0.152)
N | 327 | 311 | 16 | 327 | 311 | 16
Adjusted R2 | 0.414 | 0.393 | 0.972 | 0.414 | 0.393 | 0.972
Prob. > F | 0.000 | 0.000 | 0.000 | 0.007 | 0.008 | 0.000

Note: Standard errors are in brackets. *** p ≤ 0.01; ** p ≤ 0.05; * p ≤ 0.10.

Given that a genuine effect exists after controlling for selection bias, we report the PEESE results in Panel B, which take account of the nonlinear relationship between the PCCs and their standard errors (Stanley and Doucouliagos, 2007, 2012, chapter 4). The latter are in line with the findings in Panel A, but they also indicate a stronger effect (β0), which is now about −0.07 for mixed-country samples and the full sample and −0.26 for LICs. Doucouliagos and Ulubasoglu (2008: 65), drawing on Cohen's (1988) guidelines, indicate that the PCC represents a small effect if its absolute value is less than 0.10, a medium effect if it is 0.25 and over, and a large effect if it is greater than 0.4. Using this criterion, we conclude that corruption has a weak effect on per-capita GDP growth in mixed-country samples and the full set, and a moderate effect in LICs. However, it must be noted that the moderate effect in LICs is based on a narrow evidence base consisting of 16 estimates only.

The PET-FAT results measure the overall effects for the typical study, but they are based on the assumption that the moderating variables are equal to their sample means and independent of the standard error. As such, they overlook the potential sources of heterogeneity and attribute the bias only to publication selection. Therefore, we estimate a multivariate meta-regression that includes the moderating variables given in Table 3. These are dummy variables that are equal to 1 if the estimate reported in the primary study is dependent on the characteristic captured by the variable and zero otherwise.

Table 3. Summary Statistics (All dummy variables are divided by se_r)

| Variable | Description | N | Mean | S.D. | Min | Max |
| --- | --- | --- | --- | --- | --- | --- |
| t value | t-statistics reported in primary studies | 327 | −1.5 | 1.7 | −5.8 | 3.4 |
| Precision | Inverse of standard error of the partial correlation coefficient | 327 | 11.0 | 7.0 | 2.6 | 38.8 |
| Corruption data averaging period | Dummy variable = 1 if corruption data is averaged over period longer than 5 years | 327 | 4.9 | 7.2 | 0 | 34.9 |
| Growth data averaging period | Dummy variable = 1 if growth data is averaged over period longer than 5 years | 327 | 5.7 | 7.0 | 0 | 34.9 |
| LIC | Dummy variable = 1 if the effect-size estimate is based only on low-income-country data | 327 | 0.4 | 1.7 | 0 | 11.4 |
| WGI | Dummy variable = 1 if primary studies use World Governance Indicators corruption data | 327 | 1.0 | 2.5 | 0 | 9.9 |
| ICRG | Dummy variable = 1 if primary studies use International Country Risk Guide corruption data | 327 | 6.8 | 9.2 | 0 | 38.8 |
| TI | Dummy variable = 1 if primary studies use Transparency International corruption data | 327 | 5.6 | 7.6 | 0 | 34.9 |
| OLS | Dummy variable = 1 if primary studies use OLS estimation as opposed to all other methods | 327 | 2.4 | 5.0 | 0 | 34.4 |
| Instrumented | Dummy variable = 1 if primary studies use instrumented estimation (2SLS, 3SLS, GMM) as opposed to all noninstrumented methods | 327 | 7.3 | 7.4 | 0 | 34.9 |
| 2SLS | Dummy variable = 1 if primary studies use two-stage least-squares estimation as opposed to other instrumented and non-instrumented methods | 327 | 3.7 | 7.4 | 0 | 34.9 |
| Panel | Dummy variable = 1 if primary studies use panel data as opposed to cross-section | 327 | 1.1 | 5.1 | 0 | 38.8 |
| Journal paper | Dummy variable = 1 if the primary study is journal paper as opposed to working paper or book chapter | 327 | 7.8 | 8.3 | 0 | 38.8 |
| Publication age dummy | Dummy variable = 1 if primary study is published after 2008 | 327 | 1.5 | 0.5 | 0.7 | 2.8 |
| Se_r | Standard error of the partial correlation coefficients | 327 | 0.12 | 0.06 | 0.02 | 0.38 |
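The note in the title of Table 3 states that the dummy variables are divided by the standard error. A small sketch of what that transformation does may help in reading the summary statistics: a dummy equal to 1 takes the value of the estimate's precision (1/SE), while a dummy equal to 0 stays at 0, which would explain why most moderators have a minimum of 0 and a maximum no larger than the maximum precision. The function name and the three example rows are hypothetical.

```python
def weighted_moderator(dummy, se):
    """Divide a 0/1 moderator by the estimate's standard error (WLS weighting)."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return dummy / se


# Hypothetical rows: (dummy value, standard error of the PCC).
hypothetical_rows = [(1, 0.05), (0, 0.10), (1, 0.25)]

weighted = [weighted_moderator(d, se) for d, se in hypothetical_rows]
precisions = [1.0 / se for _, se in hypothetical_rows]

# Where the dummy is 1 the weighted value coincides with precision;
# where the dummy is 0 the weighted value is 0.
for (d, se), w, p in zip(hypothetical_rows, weighted, precisions):
    print(f"dummy={d} se={se:.2f} weighted={w:.1f} precision={p:.1f}")
```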

Our choice of moderating variables is informed by the theoretical and empirical dimensions of the research field and other factors likely to affect the estimates in primary studies. For example, endogeneity of the corruption scores and reverse causality between the latter and growth are significant issues that may confound the reported estimates (see Kurtz and Schrank, 2007). Furthermore, Kaufmann et al. (2007) demonstrate that reverse causality may depend on the time horizon over which the growth effects of corruption are calculated.

In the presence of endogeneity, OLS estimates are biased and inconsistent, and hypothesis tests can lead to misleading inference. Therefore, studies on the relationship between institutional quality and economic performance control for endogeneity using exogenous instruments that are correlated with measures of institutional quality but not with the error term of equation (2).7 Following this tradition, most of the primary studies reviewed here address endogeneity by conducting instrumented estimations instead of or in addition to noninstrumented methods. For example, Easterly et al. (2006) and Aidt et al. (2008) use the ethnic fractionalisation index as an instrument for corruption. Other studies (for example Li et al. 2000; Mauro, 1995; Ahlin and Pang 2008; Pellegrini and Gerlagh 2004; Attila 2008; Haque and Kneller 2008) carry out two-stage or three-stage least-squares (2SLS or 3SLS) estimations, using different instruments. A third set of studies utilise the generalised method of moments (GMM) estimator proposed by Arellano and Bond (1991). Studies using GMM include Gyimah-Brempong (2002) and Aixala and Fabro (2008). Therefore, in the MRA, we control for instrumental variable estimation methods to establish whether the latter yield systematically different effect-size estimates.

A second dimension of the research field relates to the length of the period over which the dependent and independent variables are averaged. Kaufmann et al. (2007) and Gwartney et al. (2004) report that the effect of institutional quality on per-capita GDP growth is felt in the long run, usually after 5–10 years. Therefore, we control for the time horizon to verify whether the effect of corruption on per-capita GDP growth is larger when the data is averaged over periods longer than 5 years.

Two further issues have been discussed in the literature: whether corruption has a regime-specific effect on growth; and whether different corruption scores measure the same phenomenon. According to Aidt et al. (2008) and Méon and Weill (2010), corruption would have weaker or no effects on growth in countries with poor governance quality overall. Given that LICs tend to cluster at the lower end of the governance quality indices, we use the LIC dummy to test the hypothesis of regime-specific effects.

Regarding corruption data sources, Kaufmann et al. (2009) report high levels of correlation (0.65 or over) between existing governance indicators in general and corruption indicators in particular. Although correlated, corruption indicators also reflect significant differences that may be related to different sampling strategies or intended end-users (business people, advocacy groups, governments, etc.). With respect to the latter, ICRG tends to cater for business decision-makers; TI data aims to inform decisions by anti-corruption advocacy groups and governments; and WGI data is intended to inform decisions by governments and development agencies. Finally, ICRG data is provided at a fee whereas TI and WGI data are available free of charge. Therefore, we test whether such variations have systematic effects on the effect-size estimates reported in primary studies.

Finally, we control for two factors: publication type and whether studies have been published after 2008 (i.e., in the last 3 years of the period for this study). With respect to the former, we check whether journal papers tend to report systematically different estimates in comparison to other publication types such as book chapters and working papers. Controlling for publication type enables us to verify whether authors and/or journal editors are pre-disposed to publish studies with significant effect estimates to justify model selection (Card and Krueger, 1995; Sterling et al., 1995; Stanley, 2008), or whether editors may face what Costa-Font et al. (2013) describe as the ‘winner's curse’, which allows the better journals to publish more intensely selected studies. We control for recently published studies to verify whether the reported effect-size estimates tend to become smaller as more researchers challenge the status quo with richer datasets. According to Gehr et al. (2006), empirical studies will tend to report smaller effect-size estimates over time because of the use of larger samples and the falsification efforts that follow the initial findings.

To capture all these dimensions of the data, we estimate the MRA model (8) with all moderating (Z) variables summarised in Table 3. However, inclusion of a large number of variables may cause over-determination and multicollinearity problems. Therefore, we follow a general-to-specific modelling procedure to reduce model complexity and ensure congruency at the same time. The procedure involves removal of the most insignificant variables (i.e. those with the largest p values) from the model one at a time until all included variables are significant (Charemza and Deadman, 1997; Doucouliagos and Stanley, 2009; for a review of the literature on general-to-specific modelling, see Campos et al., 2005). Table 4 below presents the MRA results from the specific model, with heteroskedasticity-robust, one-way cluster-robust and two-way cluster-robust standard errors.
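The general-to-specific procedure described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than the routine used in the study: the 10% significance threshold, the decision to always retain the precision term, the callback `fit_pvalues` and the toy p values are all hypothetical. In a real application the callback would re-estimate the MRA on the current specification, so p values would change after each removal; the toy callback keeps them fixed for simplicity.

```python
def general_to_specific(variables, fit_pvalues, alpha=0.10, always_keep=()):
    """Remove the least significant variable one at a time.

    variables   : names of all candidate moderators (the general model)
    fit_pvalues : callable that takes the current list of names and returns
                  a dict {name: p value} from re-estimating the model
    alpha       : significance threshold (assumed here, not from the paper)
    always_keep : names that are never dropped (assumed: the precision term)
    """
    kept = list(variables)
    while True:
        pvalues = fit_pvalues(kept)
        removable = {
            name: p for name, p in pvalues.items()
            if name in kept and name not in always_keep and p > alpha
        }
        if not removable:
            return kept  # every remaining moderator is significant
        worst = max(removable, key=removable.get)  # largest p value
        kept.remove(worst)


# Hypothetical p values, fixed across iterations purely for illustration.
HYPOTHETICAL_P = {
    "Precision": 0.75,
    "LIC": 0.001,
    "Panel": 0.62,
    "OLS": 0.35,
    "Journal paper": 0.002,
}


def toy_fit(kept):
    return {name: HYPOTHETICAL_P[name] for name in kept}


selected = general_to_specific(
    list(HYPOTHETICAL_P), toy_fit, always_keep=("Precision",)
)
```

In this toy run "Panel" is dropped first and "OLS" second, after which the loop stops because the remaining removable moderators fall below the threshold.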

Table 4. MRA results (WLS estimation, with t values as dependent variable)

| Variable | Heteroskedasticity-robust standard errors | Cluster-robust standard errors | Two-way cluster-robust standard errors |
| --- | --- | --- | --- |
| Precision | 0.017 | 0.017 | 0.017 |
| | (0.036) | (0.084) | (0.053) |
| Corruption data averaged over more than 5 years | −0.060*** | −0.060*** | −0.060*** |
| | (0.015) | (0.021) | (0.016) |
| Low-income countries | −0.114*** | −0.114*** | −0.114*** |
| | (0.020) | (0.033) | (0.014) |
| ICRG corruption data | 0.065*** | 0.065 | 0.065** |
| | (0.022) | (0.042) | (0.027) |
| Two-stage least-squares | 0.023* | 0.023 | 0.023** |
| | (0.014) | (0.015) | (0.011) |
| Journal paper | −0.107*** | −0.107*** | −0.107*** |
| | (0.017) | (0.031) | (0.017) |
| Constant | −1.063*** | −1.063* | −1.063** |
| | (0.218) | (0.611) | (0.450) |
| N | 327 | 327 | 327 |
| Model degrees of freedom | 6.000 | 6.000 | 6.000 |
| Adjusted R squared | 0.265 | 0.265 | 0.265 |
| F statistic | 40.6 | 34.0 | 34.6 |
| Prob. > F | 0.000 | 0.000 | 0.000 |

In all estimations, the F-test indicates that the covariates are jointly significant and the moderating variables account for 26.5% of the variations in t-values. We take the two-way cluster-robust estimation as the preferred model because it accounts for two types of dependence: (1) dependence between multiple estimates reported by a given study; and (2) dependence between estimates reported by different studies that use the same corruption data source.

The precision's coefficient is insignificant in all estimations. Hence, it can be stated that the overall effect of corruption on per-capita GDP growth is insignificant when we control for the moderating variables. However, the MRA results indicate that the moderating variables have significant effects on the effect-size estimates reported in primary studies. In other words, corruption's marginal effect is conditional on the moderating variables and can be calculated as the sum of all significant coefficients on the dummy variables – excluding the constant term which captures the residual selection bias. Hence, when the modelled characteristics hold, the marginal effect of corruption on per-capita GDP growth is −0.193 ( = – 0.060 – 0.114 – 0.107 + 0.065 + 0.023).
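The arithmetic behind this marginal effect can be checked with a short sketch. The coefficients are those reported in Table 4; the insignificant precision coefficient (0.017) and the constant are left out, as the text describes. The dictionary keys and the function name are hypothetical labels chosen for readability.

```python
# Significant moderator coefficients from Table 4. The precision coefficient
# and the constant are excluded, following the calculation in the text.
COEFFICIENTS = {
    "corruption_data_averaged_over_5_years": -0.060,
    "low_income_countries": -0.114,
    "icrg_corruption_data": 0.065,
    "two_stage_least_squares": 0.023,
    "journal_paper": -0.107,
}


def marginal_effect(active):
    """Sum the coefficients of the study characteristics that hold."""
    return round(sum(COEFFICIENTS[name] for name in active), 3)


# All modelled characteristics hold at once.
all_characteristics = marginal_effect(COEFFICIENTS)

# Same scenario, but the estimate is not based on LIC-only data.
without_lic = marginal_effect(
    name for name in COEFFICIENTS if name != "low_income_countries"
)

print(all_characteristics)  # -0.193
print(without_lic)          # -0.079
```

The same function can be used to evaluate any other combination of characteristics by passing only the names that apply.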

In a comprehensive study that aims to identify the explanatory variables that have a significant effect on growth, Sala-i-Martin et al. (2004) use a Bayesian approach to averaging of OLS coefficients across models. They report that 18 variables are partially correlated with growth, including primary school enrolment, initial level of income, regional dummies, religious dummies and relative price of investment goods. Corruption is not included in that analysis on the grounds that corruption data does not go back far enough to test for a long-run relationship. Hence, our finding bridges the gap in the evidence on the corruption–growth relationship by demonstrating that higher levels of perceived corruption are associated negatively with per-capita GDP growth, when relevant moderating factors are taken into account.

Results in Table 4 indicate that the effect of corruption on per-capita GDP is more adverse when primary studies average the corruption data over periods longer than 5 years, use data for LICs only and are published as journal papers rather than book chapters or working papers. The stronger negative effect associated with the time period over which corruption data is averaged indicates that corruption has more adverse effects on per-capita GDP in the longer term as opposed to the short term. This finding is in line with the literature that indicates that it takes 5–10 years for governance quality to affect economic performance (Gwartney et al., 2004; Kaufmann et al., 2007). Thus, our meta-regression estimates a small adverse effect on growth, −0.193, for published studies of LICs when researchers use ICRG data averaged over more than 5 years and employ 2SLS. For non-LICs, this estimate is notably less adverse, −0.079.

LICs tend to score low on all corruption control indices. They also have lower scores for other institutional quality indicators such as government effectiveness, rule of law and accountability (see Kaufmann et al., 2009). Hence, the stronger adverse effect on per-capita GDP growth in LICs does not support the findings of Aidt et al. (2008) and Méon and Weill (2010), who report that corruption has larger effects in countries with good quality institutions and smaller or no effects in countries with weak institutions. However, two caveats are in order. First, the number of observations with LIC-only data is rather small (16 of 327). Secondly, there is a high degree of correlation between corruption and institutional indicators such as accountability, government efficiency and rule of law (see Kaufmann et al., 2009). Therefore, in the LIC-only samples, corruption may be capturing a wider range of institutional weaknesses that are not controlled for in the primary studies.

Journal papers, as opposed to working papers or book chapters, tend to report stronger negative effects. This finding lends support to Sterling et al. (1995), who posit that journal editors’ selection criteria may be operating as a filter for significant results; and/or to the conjecture that authors may be pre-disposed to publish studies with significant effect estimates to justify model selection (Card and Krueger, 1995; Stanley, 2008). However, it may also be because of what Costa-Font et al. (2013) describe as the ‘winner's curse’, whereby journals with higher levels of perceived quality tend to report more intensely selected and biased findings. Our finding lends some support to the ‘winner's curse’ argument because the constant term that captures the residual selection bias is smaller in magnitude (−1.063) in the MRA that controls for journal papers compared to the selection bias (−1.302) in PET/FAT estimations that do not. Removing the ‘winner's curse’ further lessens corruption's effect in LICs to −0.086 ( = – 0.060 – 0.114 + 0.065 + 0.023), whereas non-LICs would then have a slightly positive effect from corruption (0.028).

Finally, results in Table 4 indicate that primary studies that use ICRG corruption data tend to report weaker adverse effects compared to others that use WGI or TI data.8 A similar result is obtained for estimates based on 2SLS. The moderating effect of the ICRG data may be due to two reasons. First, unlike WGI data, the ICRG corruption index is based on a single survey conducted by the providers rather than on aggregation of different survey results, as is the case with WGI. Dependence on multiple survey results may reduce the risk of sampling bias in WGI data, but introduces added uncertainty about the comparability of the scores across countries and over time (Kaufmann et al., 2009). Secondly, and unlike other corruption data sources, ICRG data is market-tested. In other words, it is financed by users (mainly international investors and business managers) who would be willing to pay a fee only if they perceive the data as sufficiently informative. WGI and TI data are not market-tested as they are financed through public funds or donations. Hence, our finding suggests that there may be significant differences between alternative corruption data sources and therefore researchers should be encouraged to conduct sensitivity checks to verify whether their findings remain robust across different measures of perceived corruption.

5. Conclusions

The systematic review of the evidence on corruption and growth reveals that primary studies tend to report negative effects. Meta-analysis provides useful tools for evaluating and synthesizing the effect-size estimates, taking into account heterogeneity and publication selection bias. PET-FAT-PEESE results indicate that corruption has a negative effect on per-capita GDP growth, but the magnitude of the effect (−0.072) is small in the full-country sample. The effect is more adverse (−0.258) when the reported estimates are based on LIC data only. However, the effect in LICs should be qualified by the fact that it is based on a small number of observations (16 of 327). The small number of observations on LICs is because of data constraints faced by primary studies and indicates an evidence gap that restricts the scope for evidence-based decision-making and practice.

Results from the MRA are conditional on potential sources of heterogeneity in the evidence base. Therefore, they provide more nuanced evidence on the relationship between corruption and per-capita GDP growth. They indicate that corruption's adverse effect is more pronounced when the underlying primary-study estimates relate to longer- rather than shorter-term effects, are based on LIC data only, and are reported in journal papers rather than book chapters or working papers. On the other hand, corruption's adverse effect is reduced when the underlying primary-study estimates are based on ICRG corruption data and on 2SLS, which takes account of endogeneity in growth regressions. When the effect of these moderating variables is taken into account, corruption is harmful to per-capita GDP growth. However, when the research field is not characterised by these factors, corruption does not have a significant effect on per-capita GDP growth.

Our findings indicate that meta-analysis is an effective tool for synthesizing evidence when the evidence base is too diverse to allow for general conclusions. This study has enabled us to derive verifiable conclusions about the marginal effects of corruption on per-capita GDP growth, while accounting for about 26% of the heterogeneity in the evidence base. The synthesized evidence indicates that interventions aimed at reducing corruption in LICs may be justified given the relatively more adverse growth effects reported for these countries. However, it also indicates that more investment in data compilation and research is required to ensure that policy and practice against corruption in these countries are supported by robust evidence.

Acknowledgement

This paper is based on a systematic review funded by the Department for International Development (DFID). I would like to thank DFID for funding this research. I would also like to thank Mark Petticrew and Alison Weightman for their comments on the protocol; Toke Aidt and Ian Shemilt for their comments on the review report; and the referees whose comments and suggestions have made a significant contribution to the quality of the outcome. Finally, I would like to thank Nawar Hashem and Sefa Awaworyi for their assistance during the course of the project. I remain responsible for any errors or omissions.

Notes

1. The systematic review was conducted between May and December 2010. The systematic review protocol documenting the search strategy, the study inclusion/exclusion criteria and data extraction procedure can be accessed at http://www.dfid.gov.uk/r4d/PDF/Outputs/SystematicReviews/Corruption_impact_2011_Ugur_report.pdf

2. See International Country Risk Guide (ICRG) at http://www.prsgroup.com/ICRG.aspx; Transparency International (TI) at http://www.transparency.org/policy_research/surveys_indices/cpi; and World-wide Governance Indicators (WGI) at http://info.worldbank.org/governance/wgi/sc_country.asp

3. The search terms are specified in the review protocol referenced in note 1 above.

4. The PIOS framework is informed by the PICOS (Population-Intervention-Comparator-Outcome-Study Design) framework used in Cochrane intervention reviews (see CRD, 2009: 7–10). The difference between the two is the absence of a ‘comparator’ (i.e. control group) in our review, which synthesizes effect-size estimates derived from observational data rather than randomised controlled trials (RCTs).

5. Other corruption data sources include: Business Environment Risk Intelligence at http://www.beri.com/; Dreher et al. (2007) index at http://129.3.20.41/eps/pe/papers/0406/0406004.pdf; Economist Intelligence Unit Country Risk Service and Democracy Index at http://www.eiu.com/public/#; and Sachs and Warner (1997) index at http://jae.oxfordjournals.org/content/6/3/335.full.pdf+html

6. For two-way cluster-robust estimation, we used the Stata procedure produced by Mitchell Petersen, which can be accessed at http://www.kellogg.northwestern.edu/faculty/petersen/htm/

7. In this tradition, Acemoglu et al. (2001) use colonial mortality rates as an instrument whilst Knack and Keefer (1997) use ethnic fractionalisation indices. On the other hand, Rodrik et al. (2004) use two-stage least squares to estimate the determinants of institutions first and the effect of the latter on growth thereafter.

8. This result is confirmed in PET/FAT estimations we have conducted by pooling the evidence along corruption data sources. In those estimations, which are not reported here but can be provided on request, the precision's coefficient is −0.021 for ICRG data compared to −0.047 for TI data and −0.53 for WGI data.

Appendix A: List of Empirical Studies Included in the Meta-Analysis
