
The Long-Run Impact of Foreign Aid in 36 African Countries: Insights from Multivariate Time Series Analysis


  • We wish to express our sincere appreciation for most insightful comments and constructive critique from two anonymous OBES referees as well as clear editorial guidance that helped improve our paper significantly. Our study was elaborated within the UNU-WIDER project on research and communication on Foreign Aid (ReCom), and we acknowledge financial support from Danida, Sida and UNU-WIDER.


We comprehensively analyse the long-run effect of foreign aid (ODA) on key macroeconomic variables in 36 sub-Saharan African countries from the mid-1960s to 2007, using a well-specified cointegrated VAR model as statistical benchmark. Results provide broad support for a positive long-run impact of ODA flows on the macroeconomy. In contrast, we find little evidence supporting the thesis that aid has been harmful. From a methodological point of view we highlight the importance of transparency in reporting results, especially when the hypothesis being tested differs from theoretical expectations, and we identify reasons for econometrically inadequate results in the literature.

I. Introduction

The question whether foreign aid is effective or not has since the seminal contribution on ‘aid, savings and growth’ by Papanek (1972) divided academics and aid practitioners into several camps (see Tarp, 2006). Some are disappointed and highly sceptical, a prominent example being Easterly (2003), who focuses on aid's inability to buy growth. Others, in the middle ground, hold that aid has worked, albeit not perfectly so. They argue, inter alia, that modest expectations are called for (Arndt, Jones and Tarp, 2010). A third approach is to view aid as a moral obligation of rich countries that will send ‘forth mighty currents of hope’ and lead to ‘the end of poverty’ (Sachs, 2004).

The polarized nature of the aid debate and the use of cross-country econometric studies as justification for opposing views may seem puzzling. After all, most studies use data from the exact same publicly available databases, including aid and macro data from the Development Assistance Committee (DAC) of the OECD, the Penn World Tables (PWT) and the World Development Indicators (WDI). This implies that differences in results are bound to be embedded in the use of (i) different econometric models and methods, (ii) different exogeneity/endogeneity assumptions and (iii) different choices of data transformations. For example, the literature reports different assumptions about exogeneity and endogeneity of aid as well as different measurements of variables (logs, levels, ratios, growth rates, etc.). Unfortunately, such choices regularly change the empirical results, sometimes crucially so, and can, therefore, be problematic.1

We wish to contribute to the learning about the crucial impact of methodological choices, and in this study offer an econometrically coherent picture and benchmark of aid and its effect on a set of key macroeconomic variables in sub-Saharan Africa (SSA). African examples are often used to suggest that aid is ineffective, on the grounds that African people remain among the poorest in the world despite having been major recipients of foreign aid for several decades. Based on ordinary least squares regression analyses, Dollar and Easterly (1999) argue that aid has a significantly positive effect on investment in only eight of 34 country cases. The question we address here is whether such views are firmly rooted in sound empirical testing and evidence. In contrast with most of the literature we rely on country-based time-series analysis, rather than on cross-country regressions. Our approach is similar to Morrissey (2001) and Gomanee, Girma and Morrissey (2005) in focussing on the long-run impact of aid on GDP and its main macroeconomic determinants (including gross investment, and private and government consumption). We offer a unique perspective in coverage by studying a total of 36 SSA countries for which we were able to get reasonably complete data for the last 50 years (i.e. from the mid-1960s to 2007). Riddell (2007) argues forcefully that country-based evidence provides the only reliable backdrop against which to judge whether aid works or not. Temple (2010) adds nuance, but we depart from the shared observation that many of the econometric methods used in the cross-country (as well as time-series) literature are based on strong assumptions, which need to be satisfied for the conclusions to be valid. It is a matter of concern that these assumptions are not always clearly stated and carefully checked (e.g. Dollar and Easterly, 1999). 
We recall in passing the Temple (2010) observation that ‘aid is ineffective’ is now dangerously close to being elevated to a stylized fact in some theoretical studies. This further motivates the empirical analysis pursued here. After all, whether aid works or not cannot be settled based on theory alone. Finally, we deplore the widespread misuse in the literature of insignificant parameters to conclude that aid is ineffective. Temple (2010, p. 4,448) notes: ‘An insignificant coefficient should usually be seen as absence of evidence, not evidence of absence, at least until the economic implications of a confidence interval has been explored’.

To become a satisfactory benchmark, a statistical model must encompass as many aspects as possible of the different econometric choices in the literature. The (cointegrated) VAR (CVAR) model fulfills this requirement. Starting with an explicit stochastic formulation of all variables without constraining them in pre-specified directions, the CVAR provides broad confidence intervals within which empirically relevant claims should fall (Hoover, Juselius and Johansen 2008; and Juselius, 2009). The VAR model is, in its unrestricted form, a convenient reformulation of the covariances of the data and as such can be used as a solid basis for much needed general-to-specific testing (Hendry, 2009). Moreover, because it uses rigorous statistical principles as the criterion for a good empirical model, there is little arbitrariness in the CVAR approach (Spanos, 2009). This makes it optimally designed to embed and shed light on the econometric consequences of typical empirical approaches and choices, including:

  • (i) The use of single equations to estimate the effect of aid (see e.g. the discussion in Hansen and Tarp, 2000). This approach is likely to suffer from endogeneity bias, in particular when weak instrumental variables are used. Instead of assuming aid exogeneity/endogeneity, we model all variables, including aid, jointly as a system of equations and test whether aid is endogenous or exogenous. A system approach has the additional advantage of allowing us to estimate more complicated short-run and long-run dynamic effects of aid.
  • (ii) The use of panel data to estimate the effect of aid on growth (see also Arndt et al., 2010). Panel data models are only adequate in a statistical sense under a number of fairly strict assumptions about the underlying causal mechanisms. As these may not be empirically satisfied we choose instead to estimate individual country models. This allows us to study their similarities and dissimilarities, and can be used to classify the SSA countries into more homogeneous sub-groups which are sufficiently similar to justify subsequent panel data analysis.
  • (iii) The use of cross-sectional analyses to estimate the effect of aid. While such analyses can provide valuable knowledge, they cannot say anything about the dynamic transmission of aid and its important short- and long-run effects on the macro economy. In contrast, a time-series approach makes it possible to study how the macro system adjusts in the short-run to deviations in long-run equilibrium relationships and to study the long-run impact of exogenous shocks.

The structure of the CVAR model allows us to ask two types of questions. First, does a shock to aid have a long-run impact on the macro-variables or not? By the long-run impact, we mean the effect of a change in aid on the variables of the system that remains after the ‘shock’ has run its course through the dynamics of the system and stabilized at the end-point of the so-called impulse–response function. Second, do the macro-variables impact on aid, so that aid is endogenous, or can aid be considered exogenous? How to formulate and test such hypotheses on causal links is discussed in section IV. Based on the outcome of these tests we classify the SSA countries into four groups according to the following diagram. The notation x → z (x ↛ z) means that variable x has (does not have) a long-run impact on the variable z.


Case I implies that aid and the macro-variables are unrelated in the long-run. Case II captures that aid has no effect on the macro-variables, while the latter are influencing aid. Case III reflects that aid has a long-run effect on the macro-variables, but the reverse does not hold – that is, aid is exogenous with respect to the macro-variables. Finally, Case IV implies interdependence between aid and the macro-variables: aid has a long-run impact on the macro-variables, but the reverse is also true. Our empirical analysis is organized around these four cases, noting that Case I and II are broadly consistent with a thesis of aid ineffectiveness, while III and IV suggest aid is effective in the sense of having an effect on key macro-variables.
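The four-way classification can be written as a small decision rule. The sketch below is illustrative only: the function names and the boolean inputs are assumptions made here, standing in for the outcomes of the formal tests described in section IV.

```python
def classify_country(aid_affects_macro: bool, macro_affects_aid: bool) -> str:
    """Map the two long-run causal links onto Cases I-IV.

    Both arguments are hypothetical stand-ins for test outcomes:
    aid_affects_macro  -- aid has a long-run impact on the macro-variables
    macro_affects_aid  -- the macro-variables have a long-run impact on aid
    """
    if not aid_affects_macro and not macro_affects_aid:
        return "I"    # aid and the macro-variables are unrelated
    if not aid_affects_macro and macro_affects_aid:
        return "II"   # aid is purely adjusting
    if aid_affects_macro and not macro_affects_aid:
        return "III"  # aid is exogenous and has a long-run impact
    return "IV"       # interdependence


def consistent_with_aid_effectiveness(case: str) -> bool:
    """Cases III and IV suggest aid has an effect on key macro-variables."""
    return case in {"III", "IV"}
```

For instance, a country where aid affects the macro-variables but is not affected by them would fall in Case III.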

To be sure, economic time-series data are generally found to be both unit root non-stationary and subject to structural breaks, and SSA countries are no exception in this respect. Unit root non-stationarity could not be rejected for any of the 36 country data sets. Extraordinary events, such as wars, violent overthrows of government, varying aid conditionality and modalities, etc., have also been frequent in many of the SSA countries studied here. Unless such events are adequately controlled for, statistical inference is likely to be jeopardized. We address this problem by testing whether the most crucial events have shifted the equilibrium relationship between aid and the macro-variables in a permanent way.

In contrast to many studies of aid impact in the literature (e.g. Dollar and Easterly, 1999) we carefully test the validity of the implicit homogeneity assumption behind any use of transformed data, such as GDP per capita or aid as a share of GDP. When data are non-stationary, such testing is particularly important as invalid homogeneity restrictions are likely to change cointegration properties and statistical inference in often unknown ways (Kongsted, 2005).

The remainder of this article is organized as follows: section II introduces our variables and provides a brief overview of the hypothetical transmission mechanisms of aid on the macroeconomy. Section III discusses data transformations and measurements and section IV the CVAR methodology. The econometric test procedures for aid effectiveness/ineffectiveness and aid endogeneity/exogeneity are presented as parameter restrictions on the autoregressive form and interpreted in terms of the long-run impact matrix of the Moving Average form. Section V discusses the empirical model specification for each of the 36 country models, and section VI reports the causal test results and classifies the individual countries according to the causal links diagram. Section VII takes a closer look at the sign and significance of the effects of aid on the individual macrovariables, while section VIII summarizes and discusses our results. Section VIII concludes that there is little support to highly critical views of aid and recommends that further research be focused on a small group of countries where the evidence is vexed.

II. Data, method and macroeconomic transmission channels

In line with most aid-effectiveness studies, we rely on DAC ODA net-disbursements as our measure of foreign aid.2 An alternative is the so-called Effective Development Assistance (EDA) indicator (see Chang, Fernandez-Arias and Serven, 1998), but this series is shorter and covers fewer countries. The data used here for the macroeconomic variables are from the PWT database of Heston, Summers and Aten (2009), which covers all SSA countries in this study except for Sudan, for which we use data from the WDI database.3

The effect of foreign aid on GDP growth is assumed to be transmitted through its impact on key variables in the macro economy such as investment, and private and government consumption. This is investigated based on the rich structure of a cointegrated VAR model relying on a ‘general-to-specific’ approach, which recognizes from the outset the weak link between the theory model and observed reality. The advantage of this approach is the ability to address highly relevant questions within a stringent statistical framework underpinning the validity of inference. It starts from a general characterization of long-run and short-run regularities in the data structured in a way that can provide broad information about not just one but several, often competing, theories. This is different from the more conventional ‘specific-to-general’ approach where one theory is assumed to have generated the data and inference is valid, provided this is the case. The broad theoretical survey below reflects our choice of the general-to-specific approach.

The first transmitting macrovariable, real gross investment, comprises both private and public outlays. In the Two-Gap model the main idea is that investment is constrained by one of two restrictions (gaps): insufficient domestic savings (in the original Harrod–Domar setup) or low foreign exchange holdings (due to low exports earnings) needed to import capital goods (Chenery and Bruno, 1962; Chenery and Strout, 1966). By filling these financing gaps aid can increase the level of investment and thereby lead to growth (see e.g. Hansen and Tarp, 2000). A third constraint, that is the fiscal gap, was added by Bacha (1990): aid given directly to governments may supplement insufficient domestic tax revenues, financing public investment projects or other needed expense.

The Harrod–Domar and two-gap models have over the years been subject to scathing critique,4 and their widespread and simplistic use in practice has no doubt fuelled over-optimistic expectations about aid's potency in furthering growth. Yet, whether one believes that these models can serve a useful purpose or not, few would dispute the notion that aid (among its other uses) is meant to contribute to growth via investment and capital accumulation; even in the absence of gaps (shortages of funds) aid may still change the equilibrium level of investment. For example, aid flows may help raise private investment through improvements in infrastructure, which are likely to make private investment more profitable.

In addition to investment, some aid is clearly intended for consumption (see e.g. Morrissey, 2001), and it is widely agreed that aid does increase public consumption.5 If such aid is used wisely for growth-enhancing activities in, for example, the health and education sectors other transmission channels are working. On the other hand, aid may also lead to non-productive government consumption or, via tax reductions, to higher private consumption (see e.g. Griffin, 1970; Heller, 1975; White, 1992 for a survey).6 The literature on aid fungibility has emphasized that aid may have undesirable consequences even when earmarking is possible (e.g. Griffin, 1970; Devarajan and Swarup, 1998). In any case, the broad question whether aid impacts on consumption variables or not is of interest.

It seems reasonable to assume that donors’ aid allocation decisions depend on the relevant macrovariables relative to the level of economic activity mostly measured by the level of real GDP. Also, the literature abounds with studies using aid relative to GDP rather than aid as such. Our study thus includes real GDP as one of the relevant macrovariables which allows us to test the validity of imposing such ratios from the outset.

Moreover, we use the CVAR to broadly establish whether changes in foreign aid have had a positive long-run impact on investment and/or real GDP, but also on private and public consumption. When discussing these results, we draw on the above literature and show that our findings are consistent with the theoretical literature.

III. Data transformations and measurements

Macroeconomic variables are typically trending. This suggests that a multiplicative rather than an additive empirical specification is generally desirable, but by taking logs a multiplicative model is brought back into additive form. The arguments for taking logs are mostly econometric: to avoid problems such as non-normality, non-linearity in the functional form, explosive roots in the characteristic polynomial and growing error variance. All of these problems typically afflict non-log models with trending data.7 Many studies in the development aid literature use non-logged aid either for economic reasons or because aid (depending on how it is defined) may exceptionally take negative values. The economic rationale for using non-log aid is often grounded in the above-mentioned ‘gap’ theories: aid is added to domestic savings to compensate for deficient investment or to government expenditure to compensate for deficient public investment in health and schooling, say. This clearly could be accommodated without compromising on econometric concerns by using log(investment + aid). But in this case we would have to impose the prior that aid is used to compensate for deficient savings and would have no way of checking whether this has really been the case. More generally, this approach would rule out any analysis of aid fungibility.

Accordingly, there are weighty arguments both for and against using the log transformation approach, and the final choice must ultimately depend on empirical considerations. As a sensitivity check we estimated all country models in this study using both the log-aid and the non-log aid specification.8 The former was generally superior in terms of model fit and uncorrelated errors as appears from the misspecification diagnostics of the two model versions in Table B1 in the Appendix. Thus, in the present study the econometric problems of a non-log specification seem to produce less reliable inference. For this reason and because the conclusions were fairly robust to the choice of log/non-log aid the subsequent empirical analyses are based on the log-aid specification; recognizing that, when the sample is short, panel data analyses spanning a limited number of years could preferably be based on non-logged data.

Moreover, we note that the logarithmic transformation is innocuous only as long as the variables are strictly positive and not too close to zero. This turned out to be problematic in a few cases. First, the level of foreign aid for most SSA countries is often very low in the first years of the sample period, jeopardizing the validity of the log transformation. We addressed this problem by omitting some of the first annual observations based on a test procedure in Nielsen (2008).9 Second, ODA (being a net measure of aid) became negative for both Gabon and Mauritius in 2003 and we had to choose between using the full sample and non-log of aid or the log of aid and a sample ending in 2002. Since the former specification seemed less satisfactory on almost all accounts, we preferred the latter option.

Most empirical models in the literature use ratios, such as aid-to-GDP, GDP per capita, aid per capita, etc. (e.g. Murthy, Ukpolo and Mbaku 1994; Dollar and Easterly, 1999; Gomanee et al., 2005; M'Amanja and Morrissey 2006; Malik, 2008). While frequently used, such data transformations may significantly influence the results unless the implied parameter restriction is empirically valid. For example, in a regression analysis of GDP per capita, $\ln(Y_t/N_t)$, on the aid-to-GDP ratio, $\ln(Aid_t/Y_t)$, the relation, $\ln(Y_t/N_t) = b \ln(Aid_t/Y_t) +$ error term, is based on the implicit assumption of homogeneity between GDP, population and foreign aid. Thus, this regression corresponds to the general relation, $\ln Y_t = b_1 \ln Aid_t + b_2 \ln N_t +$ error term, but with the homogeneity restriction, $b_1 + b_2 = 1$, imposed a priori.10 We tested the hypothesis that $\ln Y_t$ and $\ln N_t$ enter homogeneously for a number of countries and it was always strongly rejected. Even more importantly, the order of integration of $\ln N_t$ was found to be close to I(2) in contrast to the other variables (in particular, $\ln Y_t$) that were I(1). Scaling an I(1) variable with an I(2) variable, as in $\ln(Y_t/N_t)$, is likely to aggravate the econometric problems of unreliable inference as demonstrated in Kongsted (2005).
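The homogeneity restriction implied by a ratio regression can be made explicit with a few lines of algebra. The derivation below is a sketch under the assumption that both ratios enter in logs; the symbols $b$, $b_1$, $b_2$ and $u_t$ are notation introduced here for illustration, with $Y_t$ denoting GDP, $N_t$ population and $Aid_t$ foreign aid.

```latex
\begin{align*}
\ln(Y_t/N_t) &= b \,\ln(Aid_t/Y_t) + u_t \\
\ln Y_t - \ln N_t &= b \ln Aid_t - b \ln Y_t + u_t \\
(1+b)\ln Y_t &= b \ln Aid_t + \ln N_t + u_t \\
\ln Y_t &= \underbrace{\tfrac{b}{1+b}}_{b_1} \ln Aid_t
         + \underbrace{\tfrac{1}{1+b}}_{b_2} \ln N_t
         + \tfrac{u_t}{1+b}, \qquad b \neq -1 .
\end{align*}
```

Whatever the value of $b$, the two coefficients sum to one, $b_1 + b_2 = 1$. An unrestricted regression of $\ln Y_t$ on $\ln Aid_t$ and $\ln N_t$ leaves both coefficients free, so the restriction becomes a testable hypothesis rather than a maintained assumption.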

Another frequently investigated hypothesis is that aid-to-GDP affects investment-to-GDP (see inter alia Boone, 1996; Hansen and Tarp, 2000). Such a specification involves, however, an implicit homogeneity assumption between GDP, investment and aid.11 Several influential studies have used this type of transformed data without first checking its empirical validity (see e.g. Dollar and Easterly, 1999 and the studies surveyed in Roodman, 2007), despite the ease with which this can be done.

IV. The econometric approach

As already alluded to in the Introduction, the literature contains examples of econometric studies which are based on essentially the same data but which reach opposite conclusions. This is often the consequence of starting from a constrained model where prior assumptions have been allowed to influence the specification of the model. In such a case it is difficult to know which results are due to the assumptions made and which are true empirical facts. Given our wish to remain as objective as possible we have followed a different route: data are not constrained from the outset by prespecified theoretical restrictions unless the empirical adequacy of such restrictions has been tested and accepted (Hoover et al., 2008).

The fact that economic data are often well described by a VAR model suggests that empirically relevant economic models need to be formulated as dynamic adjustment models in growth rates and equilibrium errors, the so-called vector error correction mechanism (ECM) models, which is another name for the CVAR models (e.g. Hendry, 1995; Juselius, 2006). Such models are designed to distinguish between (i) influences that move equilibria, also referred to as pushing forces, which give rise to stochastic trends and (ii) influences that correct deviations from equilibrium, that is, pulling forces, which give rise to long-run relations (Hoover et al., 2008). The division into pulling and pushing is based on the cointegration rank, r, imposed as a reduced rank restriction in the VAR model. The test procedures are in what follows first introduced based on the autoregressive form of the CVAR model and then translated into hypotheses on the long-run impact of aid on the macrovariables based on the moving average form.

The cointegrated VAR model

We consider a five-dimensional VAR model for $x_t' = [aid_t, y_t, inv_t, c_t, g_t]$, where $aid_t$ stands for ODA, $y_t$ for real GDP, $inv_t$ for real investment, $c_t$ for real private consumption, $g_t$ for real government consumption and small letters denote logarithmic values. The model is structured around $r$ cointegration relations (the endogenous or pulling forces) corresponding to $p - r$ stochastic trends (the exogenous or pushing forces).

The pulling force is formulated as the cointegrated VAR model,

$$\Delta x_t = \alpha(\beta' x_{t-1} + \beta_0' t) + \Gamma_1 \Delta x_{t-1} + \Phi D_t + \varepsilon_t,$$

where $x_t$ is a $p$-dimensional vector of economic variables (with $p$ being equal in this case to 5), $t$ is a deterministic trend restricted to be in the cointegration relation, $D_t$ is an $m \times 1$ vector of $m$ deterministic terms (such as a constant and dummy variables), $\varepsilon_t \sim \text{i.i.d. } N(0, \Omega)$ is a $p \times 1$ vector of errors, $\Delta$ is the first difference operator, $\alpha, \beta$ are $p \times r$ coefficient matrices, $\beta_0$ is an $r$-dimensional row vector, $\Gamma_1$ is a $p \times p$ matrix of short-run adjustment coefficients, $\Phi$ is a $p \times m$ matrix of coefficients, and the lag length $k$ in the corresponding VAR in levels is assumed to be at most 2. If $k = 1$, then $\Gamma_1 = 0$ and the system, after having been pushed away from equilibrium by an exogenous shock, will adjust back to equilibrium exclusively through $\alpha$. In the more general case when $k = 2$, the system is also adjusting to lagged short-run changes in $\Delta x_{t-1}$ and $\Gamma_1$ will also influence the adjustment dynamics.

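We consider the case $r = 3$ ($p - r = 2$) as an example for the data vector at hand:

$$\begin{bmatrix} \Delta aid_t \\ \Delta y_t \\ \Delta inv_t \\ \Delta c_t \\ \Delta g_t \end{bmatrix} = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} \\ \alpha_{41} & \alpha_{42} & \alpha_{43} \\ \alpha_{51} & \alpha_{52} & \alpha_{53} \end{bmatrix} \begin{bmatrix} \beta_1' x_{t-1} + \beta_{01} t \\ \beta_2' x_{t-1} + \beta_{02} t \\ \beta_3' x_{t-1} + \beta_{03} t \end{bmatrix} + \Gamma_1 \Delta x_{t-1} + \Phi D_t + \varepsilon_t,$$

where $\beta_i' x_{t-1} + \beta_{0i} t$ is an equilibrium error and $\alpha_{ij}$ is an adjustment coefficient.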

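It is useful to partition the data vector $x_t' = [x_{1,t}, x_{2,t}']$, where $x_{1,t} = aid_t$ and $x_{2,t}' = [y_t, inv_t, c_t, g_t]$, to discriminate between long-run effects associated with foreign aid and macrovariables, and effects between the macrovariables alone. The corresponding partitioning of the $\Gamma_1$ matrix becomes:

$$\Gamma_1 = \begin{bmatrix} \Gamma_{11} & \Gamma_{12} \\ \Gamma_{21} & \Gamma_{22} \end{bmatrix},$$

where $[\Gamma_{11}, \Gamma_{12}]$ corresponds to the first row and $[\Gamma_{11}, \Gamma_{21}']'$ to the first column in $\Gamma_1$.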

In the present setup, a cointegration relation such as inline image would describe an economy where the share of aid to GDP has been stationary over time. Any deviation from its underlying constant level would initiate an adjustment reaction in variable j described by αj1 to bring this ratio back to its mean. The αj1 coefficients would tell us whether it is GDP or aid, say, that take the adjustment after the system has been pushed out of equilibrium. In this context macrovariables are Granger caused by aid if inline image and Γ21≠0 while aid itself is exogenous, that is, if inline image and Γ12=0. In econometric terms the latter is referred to as aid is strongly exogenous with respect to β, or that aid is non-Granger caused by macrovariables. The standard test of Granger causality12 would normally involve the adjustment to the lagged changes, Γ21 and Γ12, but not the long-run relations inline image.
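The adjustment mechanism can be illustrated with a deliberately small simulation. This is a minimal sketch, not the estimated model: it assumes a two-variable system (log aid and log GDP), a single cointegration relation aid − y, lag length k = 1 (so Γ1 = 0), no deterministic terms, and hypothetical adjustment coefficients. With no new shocks, the equilibrium error shrinks each period by the factor 1 + β′α.

```python
def equilibrium_error(x, beta):
    """Return beta' x, the deviation from the long-run relation."""
    return sum(b * xi for b, xi in zip(beta, x))


def cvar_step(x, alpha, beta, eps):
    """One step of dx_t = alpha * (beta' x_{t-1}) + eps_t (k = 1, r = 1)."""
    ecm = equilibrium_error(x, beta)
    return [xi + a * ecm + e for xi, a, e in zip(x, alpha, eps)]


# x = [aid, y] in logs; the cointegration relation is aid - y.
beta = [1.0, -1.0]
# Hypothetical adjustment coefficients: aid falls and GDP rises
# when the aid-to-GDP ratio is above its long-run level.
alpha = [-0.2, 0.3]

x = [1.0, 0.0]  # start one unit away from equilibrium
path = [equilibrium_error(x, beta)]
for _ in range(10):
    x = cvar_step(x, alpha, beta, [0.0, 0.0])  # no further shocks
    path.append(equilibrium_error(x, beta))

# Here beta' alpha = -0.5, so the error halves every period.
print([round(e, 4) for e in path[:4]])  # [1.0, 0.5, 0.25, 0.125]
```

Setting the first element of `alpha` to zero would correspond to aid not adjusting at all, so that GDP alone restores the ratio.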

Thus, to provide empirical content to the hypotheses underlying our causal links diagram (presented in the Introduction), it is sufficient to focus on five simple hypotheses formulated as parameter restrictions on the coefficients in β, α and Γ1:

  • (i) inline image: Aid is long-run exogenous.13 This is tested as inline image for the case where r=3 implying a zero row in α for aid. In this case, foreign aid has not been affected by any deviations from long-run equilibria in the macroeconomy, but might have been affected by short-run movements in the macrovariables. In this case, aid has generally had a long-run effect on the macrovariables (unless inline image is also true).
  • (ii) inline image: Aid is exogenous. This is tested as inline image. In this case, aid has affected the macrovariables (unless inline image is also true), but has not been affected by them, neither in the long-run nor in the short-run. When the lag length is one, Γ1=0 and inline image and inline image become identical.
  • (iii) inline image: Aid is purely adjusting, that is, aid is completely endogenous in the system. This is tested as inline image implying that the first column in α is proportional to a unit vector. In this case aid has been exclusively determined by the macrovariables, and shocks (changes) to aid have had no permanent effect on the system.
  • (iv) inline image: Aid is long-run excludable from the cointegration relations. This is tested as inline image implying that the first row of β is zero. In this case, aid has been unrelated with the long-run movements of the macro variables.
  • (v) inline image: Aid is short-run and long-run excludable, that is, inline image. In this case aid has no effect on the macro variables, neither in the short- nor in the long-run.

We now move on to show how these hypotheses can be translated into restrictions on the long-run impact matrix C that correspond to the causal links in reference.

The common trends representation

The pushing forces are analysed in the moving average form of the CVAR model, obtained by inverting equation (2):


where $C = \beta_\perp(\alpha_\perp'(I - \Gamma_1)\beta_\perp)^{-1}\alpha_\perp'$ is a matrix of rank $p - r$, $\beta_\perp$ and $\alpha_\perp$ are the $p \times (p - r)$ orthogonal complements of $\beta$ and $\alpha$, respectively, $C^*(L)$ is a stationary lag polynomial, $P_0$ depends on the initial values, and $\alpha_\perp' \varepsilon_t$ describes $p - r$ autonomous common shocks that have a permanent effect on the variables in the system (see Johansen, 1996). For example, $\alpha_{\perp,1}' = [1, 0, 0, 0, 0]$ and $\alpha_{\perp,2}' = [0, -1, 0, 0, 1]$ would describe a situation where shocks to aid and the government consumption/GDP ratio are the exogenous forces.

For the purpose of analysing the long-run impact of aid on the macroeconomy all questions of interest can be interpreted in terms of the long-run impact matrix, C. The element in the ith row and the jth column describes the long-run impact on the ith variable of a shock to the jth variable.

It is useful first to consider the individual elements of the C matrix for our empirical model:


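Based on the above partitioning of the data vector $x_t' = [x_{1,t}, x_{2,t}']$ the C matrix becomes:

$$C = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix},$$

where $C_{12} = [c_{12}, c_{13}, c_{14}, c_{15}]$ and $C_{21} = [c_{21}, c_{31}, c_{41}, c_{51}]'$.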

If C21=0, then aid has no long-run effect on any of the macrovariables, and if C12=0, the macrovariables have no long-run effect on aid, which implies that aid is exogenous. If C12=0 and C21=0, then aid and the macrovariables are unrelated. The submatrix, C22, describes the long-run effects between the macrovariables alone. The latter effects are outside the focus of this study and will not be discussed.
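A toy computation may help fix ideas about the long-run impact matrix. The sketch below assumes the standard expression C = β⊥(α⊥′(I − Γ1)β⊥)⁻¹α⊥′ from the cointegration literature and restricts attention to a two-variable system (aid, GDP) with one cointegration relation, so that the inverse is a scalar. All parameter values are hypothetical. It also checks the formula against the end-point of a simulated impulse response.

```python
def long_run_impact(alpha_perp, beta_perp, gamma1):
    """C = beta_perp (alpha_perp' (I - Gamma_1) beta_perp)^{-1} alpha_perp'.

    Only valid for p - r = 1, where the inverted term is a scalar.
    """
    p = len(alpha_perp)
    # v = (I - Gamma_1) beta_perp
    v = [sum(((1.0 if i == j else 0.0) - gamma1[i][j]) * beta_perp[j]
             for j in range(p)) for i in range(p)]
    scale = sum(alpha_perp[i] * v[i] for i in range(p))
    return [[beta_perp[i] * alpha_perp[j] / scale for j in range(p)]
            for i in range(p)]


def simulate_shock(alpha, beta, gamma1, shock, steps=200):
    """End-point of the response to a one-off shock at t = 1 (x_0 = 0)."""
    p = len(shock)
    x, dx = list(shock), list(shock)
    for _ in range(steps):
        ecm = sum(beta[i] * x[i] for i in range(p))
        dx = [alpha[i] * ecm + sum(gamma1[i][j] * dx[j] for j in range(p))
              for i in range(p)]
        x = [x[i] + dx[i] for i in range(p)]
    return x


# Hypothetical Case III economy: x = [aid, y], relation aid - y,
# aid does not adjust (alpha_1 = 0), GDP closes half the gap each year.
alpha, beta = [0.0, 0.5], [1.0, -1.0]
alpha_perp, beta_perp = [1.0, 0.0], [1.0, 1.0]  # alpha_perp'alpha = 0, beta_perp'beta = 0
gamma1 = [[0.0, 0.0], [0.0, 0.0]]

C = long_run_impact(alpha_perp, beta_perp, gamma1)
aid_shock_end = simulate_shock(alpha, beta, gamma1, [1.0, 0.0])
gdp_shock_end = simulate_shock(alpha, beta, gamma1, [0.0, 1.0])
print(C)  # [[1.0, 0.0], [1.0, 0.0]]
```

In this example the first column of C is non-zero (a shock to aid permanently raises both aid and GDP) while the second column is zero (a shock to GDP dies out), which is the pattern described as aid being exogenous with a long-run effect on the macrovariables.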

The causal interpretation above is based on the assumption that the aid residuals, ɛ1,t, measure ‘true shocks’ to aid over time. For example, if aid residuals are strongly correlated with the residuals to the macrovariables, then model (1) may have to be respecified to allow for current effects in the aid equation. This could be because there are omitted variables affecting both aid and the macrovariables, or because expectations about the macroeconomy have influenced aid decisions. In this case, the aid residuals will change and so will the estimated C matrix. However, the residual correlations between aid and the macrovariables were generally very small and insignificant in the country models. Thus, this concern does not seem to represent a problem in the present study.

Under the assumption of no current residual correlations between aid and the macrovariables,14 the hypothesis inline image can be tested as the joint test of inline image and inline image implying that aid is unrelated to the macrovariables. The hypothesis that shocks to aid have no long-run impact on the system inline image can be tested as inline image implying that aid has been adjusting to the macrovariables but not pushing them. The hypothesis that aid is exogenous, C12=0, can be tested as inline image and implies that inline image. Table 1 summarizes the relevant hypotheses and tests within our causal links diagram.

Table 1. Testable hypotheses consistent with causal links between aid and the macrovariables
| | Aid ↛ Macrovariables | Aid → Macrovariables |
|---|---|---|
| Macrovariables ↛ Aid | Case I: inline image: inline image and inline image are jointly accepted. Aid and the macrovariables are unrelated | Case III: inline image: inline image rejected and inline image (inline image if k=1) accepted. Aid is exogenous |
| Macrovariables → Aid | Case II: inline image: inline image is accepted. Aid has no long-run impact on the macrovariables | Case IV: inline image: inline image are rejected. Aid has a long-run effect on the macrovariables and vice versa |

V. Empirical model specification

Our empirical approach starts from a statistically well-specified VAR model for each of the 36 countries under study and then reduces this general statistical model by simplification testing.15 It responds to the economic questions of interest by embedding the economic model and major institutional events within the statistical model and uses strict statistical principles as criteria for an adequate empirical model.

When it comes to testing specific hypotheses we often face the fundamental challenge that the statistical null does not necessarily coincide with the economic null. For example, while there is broad agreement based on macroeconomic principles (see Rajan and Subramanian, 2008) that aid can be expected to increase growth and should be tested as such, aid ineffectiveness has often been ‘established’ based on (i) testing a statistical null which has been given priority over relevance as an economic hypothesis and (ii) relying on insignificant parameters to draw implications. Instead of reporting ‘starred’ results as an indication of significance at the 5% or 1% level, we shall therefore use empirical rejection probabilities (P-values) as a measure of support for a null which is chosen by statistical convenience rather than by its economic reasonableness.

To put this into perspective, a standard 5% test implies that we are prepared to reject the null hypothesis that aid is ineffective only if there is strong evidence against it, that is, when a result at least as extreme as the one observed would occur with probability <5% if the null were true. But the probability of rejecting a correct alternative hypothesis that aid has a positive effect on the macrovariable (making a type 2 error) can be very high even for relatively large and positive parameter values. For example, the probability of rejecting aid effectiveness when the true parameter value (β) is 1.96σ is 50%. For smaller parameter values it is even higher. In small samples like ours, with a maximum of 50 annual observations, inline image is often large and the probability of a type 2 error is likely to be high even for large and positive aid effects. The occurrence of extraordinary events (such as armed conflicts, droughts, lack of institutions, etc.) will often increase inline image, hence aggravating the problem.
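The 50% figure can be checked with a short calculation. The sketch below is illustrative only: it assumes a two-sided 5% z-test, a normally distributed estimator, and that σ is read as the standard error of the estimate; the function names are hypothetical and not part of the original analysis.

```python
from math import erf, sqrt


def normal_cdf(x: float) -> float:
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def type2_error(effect_in_se: float, critical: float = 1.96) -> float:
    """Probability of NOT rejecting H0: beta = 0 in a two-sided z-test
    when the true effect equals `effect_in_se` standard errors (assumed setup)."""
    power = (1.0 - normal_cdf(critical - effect_in_se)) + normal_cdf(-critical - effect_in_se)
    return 1.0 - power


# Type 2 error probability for a few true effect sizes, measured in standard errors.
for ratio in (0.5, 1.0, 1.96, 3.0):
    print(f"beta/sigma = {ratio:4.2f} -> P(type 2 error) = {type2_error(ratio):.2f}")
```

Under these assumptions the type 2 error probability is about one half at β = 1.96σ and rises as the true effect shrinks.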

For these reasons, and because aid effectiveness would seem to be a reasonable economic prior, one should in principle require higher P-values than the conventional 5% or 1% to conclude that the empirical evidence is in favour of aid ineffectiveness. But we also recognize that an estimated aid coefficient with a P-value >0.2, say, indicates a small, or imprecisely measured, effect.

Specification of individual country models

Due to a large number of missing observations (particularly on aid), 13 SSA countries were omitted from this study. Table A1 in the Appendix provides a list of these countries with a brief explanation of the reason for non-inclusion.

Many SSA countries became independent only around 1960, and the first years of transition from colonial to independent statehood and administrations were often volatile and gradual. Moreover, the International Development Association and some of the bilateral donor agencies were only established in the 1960s (Tarp, 2006). In a period where the relationship between aid and the macrovariables has not yet reached its long-run equilibrium, the linear relationship postulated by the VAR model is likely to provide a poor approximation. In such cases, model estimates will often improve when non-representative years are left out. To check this possibility, we applied a test for detecting influential observations described in Nielsen (2008) to the individual country models. For many of these countries the first five years, 1960–65, were singled out as excessively influential and omitted from the analysis. Table 2 reports the choice of sample period for each country.

Omitting the first five observations reduces an already small sample to a size that renders available recursive test procedures for assessing parameter stability powerless. As the VAR model is derived under the assumption of constant parameters, which may not be a plausible assumption for all model parameters over a period of 40–50 years, this is a potential problem. Because parameter instability is frequently associated with periods of political and economic turmoil – such as war, social unrest, severe droughts, interventions and adjustment reforms – we improve parameter stability by controlling for such extraordinary events, using different types of dummy variables. For example, a step dummy DsZZt defined as (0,…,0,0,1,1,1,1…,1) starting in year ZZ, can measure a shift in the equilibrium mean, for example, due to war. If it is restricted to the cointegration relations and the model has two lags, an unrestricted impulse dummy, inline image will automatically enter the model. A permanent impulse dummy, DpZZt, defined as (0,…,0,0,1,0,0,0…,0) or a transitory impulse dummy, DtrZZt, defined as (0,…,0,0,1,−1,0,0…,0) enter the VAR model unrestrictedly. Table 2 reports the type of dummy variables used in each country model.
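The three dummy types can be written out explicitly. The following is a minimal sketch with hypothetical helper names and an illustrative year range (neither taken from the country models), meant only to show the patterns defined in the text.

```python
def step_dummy(years, start):
    """Step dummy: 0 before `start`, 1 from `start` onwards."""
    return [1 if y >= start else 0 for y in years]


def permanent_impulse_dummy(years, year):
    """Permanent impulse dummy: 1 in `year`, 0 otherwise."""
    return [1 if y == year else 0 for y in years]


def transitory_impulse_dummy(years, year):
    """Transitory impulse dummy: 1 in `year`, -1 in the following year, 0 otherwise."""
    return [1 if y == year else -1 if y == year + 1 else 0 for y in years]


def first_difference(series):
    """First difference, with the initial value set to 0."""
    return [0] + [b - a for a, b in zip(series, series[1:])]


years = list(range(1988, 1996))          # illustrative sample, not from the paper
ds92 = step_dummy(years, 1992)           # (0,0,0,0,1,1,1,1)
dp92 = permanent_impulse_dummy(years, 1992)    # (0,0,0,0,1,0,0,0)
dtr92 = transitory_impulse_dummy(years, 1992)  # (0,0,0,0,1,-1,0,0)
print(ds92, dp92, dtr92)
```

Differencing the step dummy yields the permanent impulse dummy, and differencing that yields the transitory one, which is one way to see why a step restricted to the long-run relations brings an impulse term into the differenced equations.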

Table 2. Sample period, lag length, dummy variables and first and second best choice of cointegration rank
Country | Sample | Lag k | Dummy variables | Coint. rank (first) | Coint. rank (second)
Benin | 1965–2007 | 1 | inline image | 3 | 4
Botswana | 1960–2007 | 1 | inline image | 2 | 3
Burkina Faso | 1965–2007 | 2 | Dtr71t | 2 | 3
Burundi | 1962–2007 | 1 | inline image | 3 | 4
Ctrl. Afr. Rep. | 1965–2007 | 1 | inline image | 3 | 2
Chad | 1965–2007 | 1 | inline image | 3 | 2
Comoros | 1970–2007 | 1 | inline image | 3 | 2
Rep. of Congo | 1965–2007 | 1 | Ds05t | 2 | 3
Djibouti | 1970–2007 | 2 | inline image | 4 | 3
Ethiopia | 1965–2007 | 1 | inline image | 3 | 2
Gabon | 1965–2002 | 1 | inline image | 3 | 4
The Gambia | 1960–2007 | 1 | inline image | 3 | 4
Ghana | 1966–2007 | 2 | Ds05t | 3 | 4
Guinea | 1963–2007 | 1 | inline image | 3 | 4
Kenya | 1965–2007 | 1 | inline image | 3 | 2
Lesotho | 1963–2007 | 1 | inline image | 3 | 2
Liberia | 1970–2007 | 1 | inline image | 2 | 1
Madagascar | 1965–2007 | 2 | inline image | 2 | 3
Mali | 1965–2007 | 1 | inline image | 3 | 2
Mauritania | 1965–2007 | 1 | Ds92t | 2 | 1
Mauritius | 1965–2002 | 1 | Ds76t | 3 | 2
Niger | 1965–2007 | 1 | inline image | 2 | 3
Nigeria | 1960–2000 | 1 | inline image | 3 | 2
Rwanda | 1960–2007 | 1 | inline image | 3 | 2
Senegal | 1965–2006 | 1 | Ds69t | 3 | 2
Seychelles | 1960–2007 | 1 | inline image | 2 | 3
Somalia | 1970–2007 | 1 | inline image | 2 | 3
Sudan (WDI) | 1960–2007 | 2 | Ds96t | 3 | 2
Tanzania | 1962–2007 | 2 | inline image | 2 | 3
Togo | 1965–2007 | 1 | Dp93t | 2 | 3
Uganda | 1964–2007 | 1 | inline image | 3 | 4
Zambia | 1967–2007 | 1 | inline image | 3 | 2

  1. Notes: inline image has the form (…,0,1,0,−1,0,0,…), inline image and Dtr8900t is defined analogously.

  2. Sources: http://stats.oecd.org/Index.aspx; World Development Indicators (WDI) database; Heston et al. (2009).

While controlling for the effect of extraordinary events in the long- and short-run structures of our model is likely to improve parameter stability, it does not necessarily solve the problem of poor quality data which may be serious in some cases. We recognize this point up-front but note that these are the available data that have been analysed extensively in the cross-country literature. We also emphasize that our results represent average historical effects of aid over the last 40–50 years in each of the 36 countries rather than deep structural parameters, but we highlight that in contrast with the cross-country literature our estimates of aid impact are indeed allowed to vary from one country to another.

After having accounted for extraordinary events over the sample period, a VAR lag length of k=1 was sufficient to describe the variation in the data for the vast majority of countries (29). For the remaining seven countries k=2 was sufficient. Table 2 reports the choice of k for each country.

Determination of the cointegration rank

The cointegration rank determines the division into pulling (i.e. the equilibrating) forces and pushing (i.e. the exogenous) forces. The choice of r is, therefore, often crucial for the results. The maximum likelihood test procedure, the so-called trace test (Johansen, 1996), is based on a sequence of tests of the null of p−r unit roots for r=0,1,2,…,p−1. As discussed in Juselius (2006, chapter 8.5), some of these null hypotheses may not correspond to plausible economic null hypotheses. In particular, this is often the case for large values of p−r (many stochastic trends) and small values of r (few equilibrium relations), as economic theory would a priori predict that aid and the macrovariables are related in the long run. To avoid failing to reject an implausible economic null just because it happens to correspond to a conveniently testable statistical null, we need to specify in advance an economic prior for the number of autonomous stochastic shocks expected to push the system, p−r*, where r* is the number of cointegration relations consistent with this prior. It would then be justified to test the economic null of p−r* stochastic trends using a 5% test combined with a sensitivity check of the closest adjacent alternatives (see Juselius 2006).
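The sequential logic of the trace test, and the adjacent-rank sensitivity check, can be sketched as follows. This is an illustration only: the test statistics and critical values are made-up numbers, the function names are hypothetical, and the sketch does not reproduce the additional criteria used to choose the rank in this study.

```python
def select_rank(trace_stats, critical_values):
    """Sequential trace-test decision.

    trace_stats[r] is the statistic for the null of p - r unit roots
    (at most r cointegration relations), r = 0, ..., p - 1.
    Returns the first r whose null is not rejected, or p if all are rejected.
    """
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat < crit:
            return r
    return len(trace_stats)


def adjacent_ranks(r_star, p):
    """Closest alternatives to the chosen rank, used as a sensitivity check."""
    return [r for r in (r_star - 1, r_star + 1) if 0 <= r <= p]


# Hypothetical numbers for a five-variable system (p = 5); NOT estimates from the paper.
trace = [95.0, 60.0, 36.0, 12.0, 3.0]
crit_5pct = [70.0, 48.0, 30.0, 15.5, 3.8]

r_hat = select_rank(trace, crit_5pct)
print("selected rank:", r_hat, "stochastic trends:", len(trace) - r_hat)
print("adjacent ranks to check:", adjacent_ranks(r_hat, len(trace)))
```

With these invented inputs the procedure stops at r=3, that is, two stochastic trends, and the sensitivity check would look at r=2 and r=4.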

In the present study all variables are in real terms. We should therefore expect at least one stochastic trend to originate from cumulated productivity shocks. But foreign aid is in itself sometimes assumed to be exogenous in the system and, hence, could constitute a second driving trend. Thus, the economic prior would in most cases correspond to either {r=3, p−r=2} or {r=4, p−r=1}. Our results show that the former case is empirically supported for the majority of countries, whereas the latter was found for one country only. A sensitivity analysis suggested that {r=2, p−r=3} may be the best choice in 12 cases, whereas {r=1, p−r=4} obtained essentially no empirical support (see Table 2).16

The dilemma of testing a statistical null that does not correspond to the economic null is particularly relevant for the rank test. Because of the importance of the choice of rank, Table 2 reports for each country the statistically most credible value of rank, r*, as well as the second best alternative, either r*−1 or r*+1. The choice of r* is based on a variety of statistical criteria, such as the trace test, the largest unrestricted root of the characteristic polynomial for a given r, the t-ratio of the αir coefficients and the graphs of the rth cointegration relation; see Juselius (2006) for a more detailed discussion. The reason why we do not exclusively rely on the trace test (as often done in empirical applications) is that it becomes virtually uninformative for samples as small as 40–45 observations. In this case the power is often unacceptably low, resulting in a failure to reject unit roots even when the alternative is both economically and empirically more plausible. But because the choice of cointegration rank is often anything but unambiguous and the reported results can be sensitive to this choice, we have, as noted above, chosen to report the P-values not just for the preferred choice of rank, r*, but also for r*+1 or r*−1. This should ensure that the reader gets as much information as possible about the consequences of this important choice.

As already mentioned, neither the choice of full rank (data in levels are stationary) nor zero rank (data are non-stationary but not cointegrated) was supported by the statistical tests. Therefore, assuming a stationary VAR in levels without testing (e.g. because the theory model predicts stationarity), or estimating a stationary VAR in differences (e.g. to get rid of unit roots in the data) is likely to jeopardize the statistical inference. In the former case, standard inference would be incorrect, and in the latter case, valuable long-run information in the data (possibly the only reliable information) would be discarded.

The fact that r=1 was not supported by the statistical evidence is at odds with the frequent use of single equation models in the literature. This is because a single equation model is consistent with just one long-run (cointegration) relation between the included variables (as well as exogeneity of aid). The massive support for r>1 means there are several cointegration relations in development aid data that need to be understood if one is serious about understanding what existing data actually have to tell. We discuss this further in section VIII.

VI. Testing causal links between aid and the macrovariables in the SSA countries

The hypotheses about aid exogeneity, endogeneity and excludability that are associated with the causal links in Table 1 are all testable nested hypotheses in the following sense: Case I, that is, aid is unrelated to the macrovariables inline image is the most restrictive hypothesis. If not rejected with a reasonable P-value, it implies a rejection of at least some aspects of the remaining cases II–IV. If Case I is rejected, but Case II inline image cannot be rejected with a reasonable P-value, it implies a rejection of cases III–IV. If Case II is rejected but Case III cannot be rejected with a reasonable P-value, it implies a rejection of Case IV. If, finally, Case III is rejected, then we end up in Case IV which describes the general case: aid is neither exogenous nor completely endogenous. Shocks to aid are pushing to some extent but macrovariables have also affected aid. This suggests a sequence of testing that starts from the most restrictive hypothesis and ends with the least restrictive one, that is from Case I to Case III. Based on the test outcome each SSA country can be classified according to the causal links diagram in Table 1.
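The nested testing sequence can be summarised as a small decision rule. The sketch below is a simplification: it assumes a single numerical cut-off for a "reasonable P-value" (the text applies judgement rather than a fixed threshold), and the function name and example P-values are hypothetical.

```python
def classify_country(p_case1, p_case2, p_case3, threshold=0.10):
    """Walk through the nested hypotheses from most to least restrictive.

    p_case1: P-value for 'aid and the macrovariables are unrelated' (Case I)
    p_case2: P-value for 'aid has no long-run impact' (Case II)
    p_case3: P-value for 'aid is exogenous' (Case III)
    threshold: assumed cut-off above which a null is treated as not rejected
    """
    if p_case1 > threshold:
        return "I"
    if p_case2 > threshold:
        return "II"
    if p_case3 > threshold:
        return "III"
    return "IV"  # all three restrictions rejected: the general case


# Hypothetical P-values, for illustration only.
print(classify_country(0.45, 0.30, 0.20))  # stops at Case I
print(classify_country(0.01, 0.60, 0.00))  # Case II
print(classify_country(0.00, 0.02, 0.35))  # Case III
print(classify_country(0.00, 0.00, 0.01))  # Case IV
```

Because the hypotheses are nested, the first non-rejection determines the classification and later tests are not consulted.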

Testing aid ineffectiveness

The purpose of this section is to test whether aid has been completely unrelated to the macrovariables or, alternatively, had no long-run impact on them.

Aid and the macrovariables are completely unrelated

The condition {C21=0 and C12=0} can be tested as the joint hypothesis of long- and short-run exclusion, inline image and strong exogeneity, inline image. For the majority of the SSA countries for which the lag length is one (altogether 29), the test of the above condition corresponds to the joint test of long-run exogeneity, inline image and long-run exclusion, inline image. Table 3 reports the rejection probabilities (P-values) of the joint test.

Table 3. Estimated P-values for the null of no aid effect on the macrovariables
Panel A: Aid is unrelated with macrovariables (inline image). Panel B: Aid is purely adjusting (H3). Columns refer to the cointegration rank.
Country | A: r=1 | A: r=2 | A: r=3 | A: r=4 | B: r=1 | B: r=2 | B: r=3 | B: r=4
Benin | * | * | 0.20 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01
Botswana | * | 0.19 | 0.03 | * | 0.00 | 0.00 | 0.09 | *
Burkina Faso | * | 0.05 | 0.01 | * | 0.03 | 0.84 | 0.65 | *
Burundi | * | * | 0.01 | 0.00 | 0.00 | 0.00 | 0.03 | *
Cameroon | * | 0.95 | 0.01 | * | 0.00 | 0.10 | 0.16 | *
Ctrl. Afr. Rep. | * | 0.08 | 0.03 | * | 0.00 | 0.00 | 0.03 | *
Chad | * | 0.30 | 0.02 | * | 0.00 | 0.93 | 0.83 | *
Comoros | * | 0.72 | 0.12 | * | 0.00 | 0.00 | 0.00 | *
Republic of Congo | * | 0.00 | 0.00 | * | 0.02 | 0.26 | 0.15 | *
Djibouti | * | * | 0.00 | 0.00 | 0.01 | 0.01 | 0.09 | 0.23
Ethiopia | * | 0.02 | 0.00 | * | 0.00 | 0.01 | 0.01 | *
Gabon¹ | * | * | 0.00 | 0.00 | 0.00 | 0.01 | 0.08 | 0.29
The Gambia | * | * | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.36
Ghana | * | * | 0.00 | 0.00 | 0.00 | 0.00 | 0.06 | 0.86
Guinea | * | * | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01
Kenya | * | 0.00 | 0.00 | * | 0.01 | 0.08 | 0.04 | *
Lesotho | * | 0.06 | 0.00 | * | 0.00 | 0.00 | 0.00 | *
Liberia | 0.61 | 0.05 | 0.09 | * | 0.00 | 0.00 | * | *
Madagascar | * | 0.00 | 0.00 | * | 0.00 | 0.06 | 0.28 | *
Malawi | 0.00 | 0.00 | * | * | 0.00 | 0.25 | * | *
Mali | * | 0.04 | 0.01 | * | 0.00 | 0.01 | 0.00 | *
Mauritania | 0.00 | 0.00 | * | * | 0.00 | 0.01 | * | *
Mauritius¹ | * | 0.01 | 0.00 | * | 0.02 | 0.01 | 0.04 | *
Niger | * | 0.27 | 0.05 | * | 0.00 | 0.02 | 0.47 | *
Nigeria | * | 0.02 | 0.00 | * | 0.00 | 0.01 | 0.03 | *
Rwanda | * | 0.02 | 0.00 | * | 0.00 | 0.00 | 0.00 | *
Senegal | * | 0.00 | 0.00 | * | 0.00 | 0.00 | 0.13 | *
Seychelles | * | 0.00 | 0.00 | * | 0.00 | 0.00 | 0.45 | *
Somalia | * | 0.01 | 0.00 | * | 0.00 | 0.13 | 0.57 | *
Sudan | * | 0.21 | 0.04 | * | 0.00 | 0.00 | 0.05 | *
Swaziland | * | 0.12 | 0.00 | * | 0.02 | 0.33 | 0.39 | *
Tanzania | * | 0.70 | 0.37 | * | 0.00 | 0.00 | 0.06 | *
Togo | 0.13 | 0.03 | 0.02 | * | 0.00 | 0.00 | 0.06 | *
Uganda | * | * | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
Zambia | * | 0.00 | 0.00 | * | 0.01 | 0.09 | 0.04 | *
Zimbabwe | * | 0.00 | 0.00 | * | 0.04 | 0.05 | 0.05 | *

  1. Note: The sample period for Gabon and Mauritius ends in 2002 due to a negative aid entry in 2003.

  2. Source: Authors’ estimations.

When interpreting the estimated P-values it should be kept in mind that the power of the joint test to reject an incorrect null is typically related to the number of (significant) parameters being tested. For instance, the hypothesis inline image implies r zero restrictions on the β parameters of the CVAR, and the hypothesis inline image implies p−1 zero restrictions on the Γ1 matrix (when k=2). Thus, the case (p=5,k=2,r=2) corresponds to six restrictions, and the case (p=5,k=1,r=2) corresponds to two. If the p−1=4 coefficients in Γ21 are not highly significant (which a priori is likely to be the case), then a significant parameter in β can be hard to detect (as many insignificant parameters tend to lower the power of the joint test). For the majority of the countries (29) a lag length of one was sufficient to describe the variation in the data. Thus, low power due to many insignificant coefficients may only be a problem in the remaining few cases.
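The restriction counts quoted above follow from simple arithmetic, sketched here for the two lag lengths used in this study. The function name is hypothetical and the formula is only meant to reproduce the two cases stated in the text (k=1 and k=2), not a general rule for longer lags.

```python
def joint_test_restrictions(p: int, k: int, r: int) -> int:
    """Number of zero restrictions counted in the text for the joint test.

    r restrictions on beta (long-run exclusion) plus, when k = 2,
    p - 1 restrictions on the short-run matrix Gamma_1.
    """
    if k not in (1, 2):
        raise ValueError("only the lag lengths k=1 and k=2 discussed in the text are covered")
    long_run = r
    short_run = (p - 1) if k == 2 else 0
    return long_run + short_run


print(joint_test_restrictions(p=5, k=2, r=2))  # 2 + 4 restrictions
print(joint_test_restrictions(p=5, k=1, r=2))  # 2 restrictions
```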

To provide as much information as possible about the sensitivity of the results to the choice of cointegration rank, we have calculated the P-values of aid ineffectiveness for all possible ranks. To avoid information overload, we distinguish between empirically plausible and less plausible results by emphasizing the preferred choice of rank, r*, in bold face and the second best choice, either r*+1 or r*−1, in italics. In addition, we have left out the P-values for r<r* or the second best choice. The reason is that the result for the best or the second best choice overrides the previously obtained result in the following sense: If, for example, aid is only significant in the third cointegration relation, then we should expect high P-values for r=1,2 but a low P-value for r=3. If r*=3, then the result for this case overrides the previous ones. If, on the other hand, aid adjusts significantly to the first and/or the second cointegration relation, then the P-value for r*=3 would still reflect the previous results.

Table 3 shows that the restriction {C21=0 and C12=0} receives little or no support in the vast majority of the SSA countries. Only in two cases, Comoros and Tanzania, is it possible to obtain fairly strong support for the joint hypothesis. Another two cases, Benin and Botswana, show somewhat more moderate support for the preferred case, but this conclusion is reversed when increasing the rank by one (the second best choice). This leaves Comoros and Tanzania as the only countries for which aid and the macrovariables seem essentially unrelated.17 Of course, this conclusion is based on a fairly restricted information set and it may not be robust to the inclusion of other important omitted variables. A more detailed econometric analysis of the outlying countries would be needed to clarify why these two countries seem to differ from the majority.

Aid is purely adjusting to the macrovariables?

Section IV discussed a procedure for testing the hypothesis that the level of aid has been purely adjusting to the macrovariables, implying that shocks to aid have not had any significant long-run impact on the macrovariables. This could, for example, describe a situation where donors routinely allocate aid according to a simple rule involving the macrovariables and corrupt government officials use the money for private purposes. But it needs to be emphasized that the test results are not invariant to omitted variables, and a failure to reject the hypothesis is evidence of aid ineffectiveness only within our specific model. Other relevant variables, if included, may change the test results. With this caveat in mind we shall interpret the test results in Table 3.

For the preferred rank, r* (in bold), the null that aid is unrelated to the macrovariables can be safely rejected based on zero or small P-values (0.05 or less) for 31 countries. These results remain reasonably robust between the first and second best (in italics) choice of cointegration rank. In 27 (of the 31) cases the rejection of the null is in our assessment unaltered as one moves to the second choice of cointegration rank. Only for Cameroon, Chad, Liberia and Sudan is there reasonably strong evidence for non-rejection of the null hypothesis of no long-run effect of aid on the macrovariables for the second choice rank.18

As regards the hypothesis that aid has been purely adjusting to the selected macrovariables, there is not much support in our data. The only SSA countries for which there is convincing evidence in favour of accepting this type of aid ineffectiveness seem to be Burkina Faso, Chad and Swaziland.

Testing aid exogeneity

Many empirical studies in the early aid literature are based on regression analysis with aid as the key explanatory variable (Hansen and Tarp, 2000). Such a model choice is implicitly based on the assumption that aid is exogenous to the macrovariables. Because the macroeconomic stance of a developing country is likely to influence the amount of foreign aid allocated by donor countries, aid endogeneity has been recognized in the literature as a potential problem (see e.g. Mosley, 1980) and typically addressed by introducing instrumental variables. Even though good instrumental variables can potentially control for the simultaneity bias, sufficiently strong instruments are difficult to find. This problem can be avoided by estimating a full system of equations as we do in this study. In addition, a system approach allows us to test aid exogeneity using likelihood-based test procedures, thereby checking whether assumptions of aid exogeneity have created an inference problem in the early studies.

Strong exogeneity (inline image) corresponds to C12=0 and implies that aid has been unaffected by the macrovariables both in the long and the short run, whereas long-run exogeneity (inline image) does not as such imply C12=0. This is because aid in this case is only unaffected by the macrovariables in the long run, but can be affected by short-run movements in the macrovariables. The results in Table 4 are for tests of C12=0, noting that inline image is identical to inline image for the 29 SSA countries with a lag length of one.

Table 4. Estimated P-values for the hypothesis of aid exogeneity
Columns r=1 to r=4 refer to the cointegration rank.
Country | k | Case | r=1 | r=2 | r=3 | r=4
Benin | 1 | IV | * | * | 0.06 | 0.00
Botswana | 1 | IV | * | 0.18 | 0.09 | 0.12
Burkina Faso | 2 | II | * | 0.00 | 0.00 | 0.00
Burundi | 1 | IV | * | * | 0.00 | 0.00
Cameroon | 1 | IV | * | 0.96 | 0.00 | 0.00
Ctrl. Afr. Rep. | 1 | III | * | 0.15 | 0.16 | 0.22
Chad | 1 | II | * | 0.10 | 0.00 | 0.00
Comoros | 1 | I | * | 0.42 | 0.07 | 0.10
Rep. of Congo | 1 | II | * | 0.00 | 0.00 | 0.00
Djibouti | 2 | IV | * | * | 0.00 | 0.00
Ethiopia | 1 | IV | * | 0.02 | 0.01 | 0.00
Gabon | 1 | IV | * | * | 0.00 | 0.00
The Gambia | 1 | IV | * | * | 0.00 | 0.00
Ghana | 2 | IV | * | * | 0.00 | 0.00
Guinea | 1 | IV | * | * | 0.02 | 0.03
Kenya | 1 | IV | * | 0.00 | 0.00 | 0.00
Lesotho | 1 | III | * | 0.14 | 0.22 | 0.04
Liberia | 1 | III | 0.92 | 0.71 | 0.62 | 0.32
Madagascar | 2 | II | * | 0.00 | 0.00 | 0.00
Malawi | 1 | III | 0.28 | 0.05 | 0.04 | 0.03
Mali | 1 | III | * | 0.22 | 0.16 | 0.01
Mauritania | 1 | IV | 0.08 | 0.19 | 0.13 | 0.14
Mauritius | 1 | IV | * | 0.05 | 0.07 | 0.05
Niger | 1 | III | * | 0.92 | 0.18 | 0.01
Nigeria | 1 | IV | * | 0.10 | 0.00 | 0.00
Rwanda | 1 | IV | * | * | 0.00 | 0.00
Senegal | 1 | IV | * | 0.00 | 0.00 | 0.00
Seychelles | 1 | IV | * | 0.00 | 0.00 | 0.00
Somalia | 1 | II | * | 0.16 | 0.04 | 0.00
Sudan | 2 | III | * | 0.23 | 0.20 | 0.11
Swaziland | 1 | II | * | 0.04 | 0.00 | 0.00
Tanzania | 2 | I | * | 0.37 | 0.27 | 0.15
Togo | 1 | III | * | 0.38 | 0.40 | 0.13
Uganda | 1 | IV | * | * | 0.00 | 0.00
Zambia | 1 | IV | * | 0.00 | 0.00 | 0.00
Zimbabwe | 2 | IV | * | 0.00 | 0.00 | 0.00

  1. Notes: *The entry ‘0.00’ stands for P-values <0.005.

  2. †The preferred choice of cointegration rank is in bold face, the second best choice in italics.

  3. ‡For the countries with k=2, the calculations are done in OxMetrics.

  4. Source: Authors’ estimations.

The exogeneity test is reported for all countries, independently of whether they were already classified as Case II or Case I economies. For Case I economies (i.e. Comoros and Tanzania) the hypothesis of unrelatedness also implies C12=0, and a high P-value does not mean that the previous conclusion of aid unrelatedness has been changed to aid exogeneity. For the six Case II countries we would, however, expect exogeneity to be rejected, and it is. For the preferred choice of rank, strong exogeneity of aid receives little support in the majority (25) of the SSA countries and for the second best choice, the conclusions are basically unchanged. Of the 11 countries for which exogeneity was not outright rejected, only six (Lesotho, Liberia, Mali, Niger, Sudan and Togo) could be safely classified as Case III economies, whereas Botswana, Central African Republic and Mauritania might be accepted based on more moderately sized P-values. Of these, only Central African Republic was classified as a Case III economy, mostly motivated by the test results in Table 4. Somalia and Malawi are borderline cases which we classified as Case II economies. Altogether, the conclusion that aid is exogenous for only a small minority of the SSA countries seems reasonably well grounded, demonstrating the peril of assuming aid exogeneity without testing.

To sum up, the classification of the SSA countries into our four categories describing different transmission mechanisms between foreign aid and the macrovariables was in most cases reasonably clear, but there were also a few borderline cases where a country could almost equally well have been assigned to a different category. The overall conclusion that most of the SSA countries belong to the group of Case IV economies calls for a more detailed analysis of the long-run impact of aid on individual macrovariables. This is the purpose of the next section.

VII. The long-run effect of aid on individual macrovariables

While the tests in section VI allowed us to classify each SSA country according to the overall effect of aid, they are uninformative about the sign and magnitude of the effects of aid on the individual macrovariables. Obviously, a negative effect of aid, on say GDP or investment, while significant, would not be evidence of aid effectiveness, and we also need to discuss the signs and significance of the individual coefficients of C21.

Most studies in the literature discuss the effectiveness/ineffectiveness of aid relative to its ability to enhance growth defined as GDP or investment growth. Our empirical set-up is designed to examine the sign and significance of the estimated long-run impact of aid on these two variables, but also on private and public consumption. While the interpretation of a positive/negative effect on GDP or investment is unambiguous, this is not necessarily the case with a positive effect of aid on government consumption, which can be either growth enhancing (if it is associated with expenditure on health and education, say) or growth retarding (if it is associated with corruption/fungibility). Similarly, a positive effect on private consumption can also imply less growth if the increase in consumption is crowding out growth-enhancing investment. To avoid this ambiguity we define our economic prior in terms of the sign and significance of the long-run impact effect of aid on investment and GDP, and we report the results that support our prior based on either the first or second best choice of r.

As before, we need to address the sensitivity of the results to the choice of rank. If the rank is too low some of the assumed stochastic trends are stationary rather than non-stationary; if it is too high some of the deviations from a long-run equilibrium relation are sufficiently persistent to be considered non-stationary rather than stationary. In either case, the magnitude, sign and significance of the estimated coefficients of C21 can be influenced, even considerably so. As the first or second best choice of rank is often associated with some ambiguity, reporting the results and conclusions needs to be done cautiously. This problem can be aggravated by the fact that the preferred choice of rank might to some extent be influenced by the researcher's economic prior which may not be openly stated. We address this ambiguity by presenting the results as openly as possible. To achieve maximum transparency, Table C1 in the Appendix reports the estimated asymptotic t-ratios for the coefficients in C21 for the first, second and third best choice of rank. Based on these, the reader can assess/check our conclusions as well as other potentially interesting priors/hypotheses.

The results in section VII are reported from the point of view of a researcher with an economic prior that foreign aid has been effective. As this might potentially introduce a ‘publication bias’ the results will be complemented with a sensitivity analysis, where we ask the question: ‘How robust are the results from the point of view of a researcher with an economic prior that aid is harmful?’

Finally, we emphasize again that our sample size is very small in statistical terms and the asymptotic standard errors on which these t-ratios are based may not closely approximate the correct ones. But even though the t-ratios do not necessarily follow Student's t-distribution, they are informative of the relative significance of the estimated long-run effects of aid on the macrovariables.

Assessing the economic prior that aid is effective

We interpret aid to be potentially effective if its long-run impact is significantly positive on either investment, GDP or both. The reported results are for the first best choice of rank if it satisfies this condition; otherwise we check the second best choice of rank and report the results if it supports the aid effectiveness criterion. If neither the first nor the second best choice of rank satisfies the effectiveness criterion, the one which comes closest to showing a positive effect of aid on the macrovariables, for example, a positive but insignificant effect, is reported. In this sense Table 5 reports the results from the point of view of a researcher with an economic prior that aid has had positive effects on the macroeconomy.
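The reporting rule just described can be sketched as a small selection function. Two elements are assumptions made for illustration: "significantly positive" is taken to mean a t-ratio above 2, and "comes closest" is operationalised as the larger of the two t-ratios; the function names and t-ratios below are hypothetical.

```python
SIGNIFICANT_T = 2.0  # assumed cut-off for a significantly positive effect


def is_effective(t_ratios):
    """Effectiveness criterion: significantly positive effect on GDP, investment or both."""
    return t_ratios["gdp"] > SIGNIFICANT_T or t_ratios["inv"] > SIGNIFICANT_T


def closeness(t_ratios):
    """Assumed measure of how close a result comes to a positive effect."""
    return max(t_ratios["gdp"], t_ratios["inv"])


def choose_reported(first_best, second_best):
    """Return which rank choice ('first' or 'second') would be reported."""
    if is_effective(first_best):
        return "first"
    if is_effective(second_best):
        return "second"
    # Neither satisfies the criterion: report the one closest to a positive effect.
    return "first" if closeness(first_best) >= closeness(second_best) else "second"


# Hypothetical t-ratios for the long-run impact of aid on GDP and investment.
print(choose_reported({"gdp": 2.4, "inv": 0.3}, {"gdp": -1.0, "inv": 3.0}))  # first
print(choose_reported({"gdp": 1.1, "inv": 0.3}, {"gdp": 0.2, "inv": 2.6}))   # second
print(choose_reported({"gdp": 0.4, "inv": -0.8}, {"gdp": 1.5, "inv": 0.9}))  # second
```

Writing the rule down this way makes the direction of the selection explicit, which is the reason the text later complements it with a sensitivity analysis under the opposite prior.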

Table 5. The estimated long-run impact of aid on the macrovariables under the economic prior of aid effectiveness
  1. Notes: The entries refer to the sign and significance of estimated elements of C21. The symbol + or − stands for a t-ratio numerically > 2, +0 or −0 for a numerical t-ratio between 1.6 and 2 and +00 or −00 for a numerical t-ratio below 1.6.

  2. Source: Authors’ estimations based on Table C1 in the Appendix.

 Benin (IV)Botswana (IV)Burkina Faso (II)Burundi (IV)
yt +000+00+00
invt ++00+
ct ++00+00
gt +00+00+
 Cameroon (IV)Ctrl. Afr. Rep. (III)Chad (II)Comoros (I)
yt +00+00+0
invt +00++00
ct 00+00+
gt 00+0000
 Rep. of Congo (II)Djibouti (IV)Ethiopia (IV)Gabon (IV)
yt +000000+00
invt +0+++00
ct 00+00+00
gt +00+++00
 The Gambia (IV)Ghana (IV)Guinea (IV)Kenya (IV)
yt +++00+
invt +++
ct +00+00+
gt +00++00
 Lesotho (III)Liberia (III)Madagascar (II)Malawi (II)
yt ++++
invt +++00+00
ct +++00+
gt ++00+00
 Mali (III)Mauritania (IV)Mauritius (IV)Niger (III)
yt +00+00+
invt ++++
ct +0000+
gt +00++
 Nigeria (IV)Rwanda (IV)Senegal (IV)Seychelles (IV)
yt +0++00
invt +00+++
ct ++++
gt +000+00
 Somalia (II)Sudan (III)Swaziland (II)Tanzania (I)
yt +00+0000
invt +++0+
ct +00+0000
gt 00+00000
 Togo (III)Uganda (IV)Zambia (IV)Zimbabwe (IV)
yt ++00+0
invt +0+0+
ct 0++00+00
gt +++0+

To improve the readability of Table 5, we have indicated significance and sign of a coefficient using the following symbols: + or − implying a t-ratio numerically >2, +0 or −0 a numerical t-ratio between 1.6 and 2, and +00 or −00 a numerical t-ratio below 1.6. The results show that in 27 of our 36 SSA countries aid has had a significantly positive effect on either investment, GDP or both, when choosing between the first or second best choice of rank. In seven countries the effect of aid on GDP or investment is positive but insignificant, and in only two countries, Comoros and Ghana, is there a significantly negative effect. Thus, according to the above criterion there is evidence of aid ineffectiveness only for these two countries. However, one may add that this conclusion is in fact not clear-cut for Ghana, since for this country there is a positive effect on GDP that counteracts the negative investment effect.
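The symbol coding is mechanical and can be expressed in a few lines. The sketch below uses a hypothetical function name; the treatment of a t-ratio of exactly 2 (coded as +0 or −0) and of exactly zero (given a plus sign) are assumptions, since the text does not specify the boundaries.

```python
MINUS = "\u2212"  # the minus sign used in the table notes


def code_t_ratio(t: float) -> str:
    """Map a t-ratio to the sign/significance symbol described for Table 5."""
    sign = "+" if t >= 0 else MINUS   # zero is given a plus sign (assumption)
    size = abs(t)
    if size > 2.0:
        return sign                   # numerically > 2
    if size >= 1.6:
        return sign + "0"             # between 1.6 and 2
    return sign + "00"                # below 1.6


# Hypothetical t-ratios, for illustration only.
for t in (2.5, 1.8, 0.7, -0.4, -1.7, -3.1):
    print(f"{t:5.1f} -> {code_t_ratio(t)}")
```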

The results in Table 5 can also be used to check the consistency of the classification into Case I, II, III or IV economies in the previous section. We would, for example, expect countries classified as Case I and II to have insignificant coefficients in C21, whereas countries classified as Case III and IV would have significant coefficients. Table 6 provides this information by showing how the estimated long-run effects of aid on the four macrovariables are distributed for each category when distinguishing between significance and sign.

Table 6. The number of Case I–IV countries according to sign and statistical significance of the effect of aid on the macrovariables
  1. Source: Table 5.

Case I – 2 countries
Case II – 7 countries
Case III – 7 countries
Case IV – 20 countries

It appears that aid has had a significant effect on investment in 15 out of 20 Case IV countries and in six out of seven Case III countries, but only in two of nine Case I or II countries.19 In 27 countries the effect of aid on GDP is similarly positive and statistically significant in the majority of cases. The effect on private and government consumption is positive but with several insignificant effects. The last column in the table shows that in three cases aid has had a significantly negative effect on private consumption, in two cases on government consumption, in no case on GDP and in two cases on investment (Comoros and Ghana).

The results in Table 5 are also consistent with the overall tests of ineffectiveness. For example, Burkina Faso, Chad and Swaziland, classified as clear Case II economies, show almost exclusively +00 or −00 entries and, according to Table C1, this is relatively robust to the choice of rank. The fact that there is a significantly positive effect of aid on investment for Tanzania suggests that this effect alone was not sufficiently strong to show up in the joint tests. We also note that a preliminary analysis (not shown) based on the influential diagnostics approach in Nielsen (2008) suggested that the years 1992–95 may have played an important role in Tanzania in accordance with the well-established fact that this period saw strained relations between the Tanzanian government and the donors, which had a strongly negative influence on aid flows. Normal relations were only restored after an agreement was reached with the IMF in 1996 and a new government had taken office. This illustrates that a more detailed investigation is generally needed before one can convincingly argue that aid has had no effect in Case I and II countries. Such further country case research is now underway for Tanzania.

Is the aid effectiveness conclusion robust?

The results so far have provided strong support for the aid effectiveness prior. But this conclusion might have been affected by ‘publication bias’ due to the way we have selected the results. This would indeed be the case if the sign and the significance of the estimated coefficients alternate between the first and second best choice of rank. Table 7 reports the number of countries for which either the positive or the negative aid effectiveness prior is significantly supported by the estimated income or investment coefficient. This is done allowing for three alternative search procedures: (i) only for the preferred rank, (ii) between first or second best choice of rank and (iii) between first, second or third best choice of rank.

Table 7. A sensitivity analysis of the effect of aid on GDP and investment under two different economic priors
  Economic prior 1: Aid is effective (number of countries with significantly positive effects) | Economic prior 2: Aid is harmful (number of countries with significantly negative effects)
  Choice of rank: 1st best | 1st or 2nd best | 1st, 2nd or 3rd best | 1st best | 1st or 2nd best | 1st, 2nd or 3rd best
  1. Source: Authors’ calculations.


The entries in the column for ‘1st best’ under Economic prior 1 show that aid has had a significantly positive effect on GDP in 12 countries and on investment in 15 when considering only the preferred rank (r*) models. In contrast, in only two countries did aid have a significantly negative effect on GDP and investment under Economic prior 2. The entries in the column ‘1st or 2nd best’ are found under a more flexible search algorithm: if the first best rank does not deliver the desired result but the second best does, then we choose the second best. Under the column ‘1st, 2nd or 3rd best’ we extend our search to include also the third best choice of rank.

The results show that if we search for significantly negative effects of aid on investment among the first or second best calculations we will find five such countries, whereas if we search for significantly positive effects we find 24 cases. For GDP the corresponding figures are six and 17, respectively. If the search is between the first, second and third best alternatives, that is, essentially all empirically possible values of r, there are significantly negative aid effects on GDP and investment in only nine and seven countries, respectively, compared with 19 and 25 countries having significantly positive aid effects on GDP and investment. Thus, the search for significantly negative investment and GDP effects in all empirically reasonable specifications only produced a few countries where this seemed empirically relevant. In contrast, the significantly positive effects received far more support. Altogether we interpret the results of this section as a strengthening of our previous conclusion that foreign aid has by and large been effective.
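The search over first, second and third best choices of rank can be illustrated with a minimal Python sketch. It is not the procedure used to produce Table 7; the function names `supports` and `count_support` are hypothetical, the cut-off |t| > 2 is an assumption carried over from the symbol coding of Table 5, and only five countries are included, with investment t-ratios copied from Table C1.

```python
THRESHOLD = 2.0  # assumed cut-off: |t| > 2 counts as significant

# Investment t-ratios (first best, second best, third best) from Table C1
inv_t = {
    "Benin": (3.02, -2.40, -8.49),
    "Chad": (0.54, 0.44, 0.44),
    "Comoros": (-3.27, -0.59, -6.29),
    "Ghana": (-4.11, -0.21, -3.18),
    "Kenya": (3.20, 2.93, 3.17),
}


def supports(t_by_rank, prior, depth):
    """Return True if any of the first `depth` rank choices supports the prior."""
    candidates = t_by_rank[:depth]
    if prior == "effective":
        return any(t > THRESHOLD for t in candidates)
    if prior == "harmful":
        return any(t < -THRESHOLD for t in candidates)
    raise ValueError(f"unknown prior: {prior}")


def count_support(table, prior, depth):
    """Count countries for which the prior finds significant support."""
    return sum(supports(t, prior, depth) for t in table.values())


for depth in (1, 2, 3):
    print(depth,
          count_support(inv_t, "effective", depth),
          count_support(inv_t, "harmful", depth))
```

In this small subset, Benin is counted under both priors once the second best rank is admitted, which illustrates why reporting the search depth matters for a sensitivity analysis of this kind.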

Table 7 focussed exclusively on the long-run impact of aid on GDP and investment. Based on Table C1 in the Appendix, it is also possible to study other hypotheses from the point of view of different economic priors. For example, suppose we want to find out whether there is empirical support for the view that foreign aid has primarily gone to private consumption without much improvement of investment and/or GDP.20 When we search among first and second best specifications in Table C1, we find evidence supporting such an outcome only for Benin, Comoros and Mauritania. But if we search among all three specifications, only Comoros remains and if we only allow for the first best specification, Mauritania has a significantly positive investment effect and an insignificant consumption effect, while Benin has a significantly positive investment and private consumption effect. For the majority of countries positive consumption effects of aid are accompanied by positive investment and GDP effects. If the same experiment is conducted with government rather than private consumption, the same picture emerges. In fact, when the choice is between first and second best specifications, a long-run positive impact of aid on government consumption is always accompanied by a positive impact on GDP and/or investment.
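The check described above can be sketched as follows for the first best specification. The operationalization is our own assumption, not a rule stated in the text: a country is flagged when the private consumption t-ratio exceeds 2 while neither the GDP nor the investment t-ratio does. The function name `consumption_only` is hypothetical, and the three countries and their t-ratios are taken from Table C1.

```python
THRESHOLD = 2.0  # assumed cut-off for a significantly positive effect

# First-best t-ratios from Table C1 for three named countries
first_best = {
    "Benin": {"y": 1.09, "inv": 3.02, "c": 3.74},
    "Comoros": {"y": 1.70, "inv": -3.27, "c": 6.60},
    "Kenya": {"y": 3.87, "inv": 3.20, "c": 3.67},
}


def consumption_only(t):
    """Flag a country where aid raises private consumption significantly
    but has no significantly positive effect on GDP or investment."""
    consumption_up = t["c"] > THRESHOLD
    no_real_gain = t["y"] <= THRESHOLD and t["inv"] <= THRESHOLD
    return consumption_up and no_real_gain


flagged = [country for country, t in first_best.items() if consumption_only(t)]
print(flagged)
```

Among these three countries only Comoros is flagged, whereas Benin and Kenya combine a positive consumption effect with a significantly positive investment effect.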

We conclude that the aid ineffectiveness view has not received much support in our study and that the more extreme view suggesting that aid is consumed rather than invested has essentially received no support.

VIII. Conclusion

The aim of this study was to provide a broad and statistically well-founded picture of the effect of aid on the macroeconomy of 36 SSA countries. Applying our cointegrated VAR model to each of these countries, we found convincing support for the hypothesis that aid has had a positive long-run impact on investment and GDP in the vast majority of cases, and almost no support for the hypothesis that aid has had a negative effect on these variables. In 27 of our 36 SSA countries aid has had a significantly positive effect on investment, GDP or both. In seven countries the effect of aid on GDP or investment is positive but insignificant, and only in two countries, Comoros and Ghana, is one of them significantly negative. Thus, only for these two countries is there evidence of aid ineffectiveness when one departs from an ‘aid is effective’ economic prior. In addition, (extreme) fungibility, meaning that aid increases consumption and has a negative effect on investment and/or GDP, found no empirical support in our analysis.

When we depart from the ‘aid is harmful’ prior, the difference in empirical support between extreme views of aid effectiveness is striking. In this case, we find only nine and seven countries for which there is a significant negative effect on GDP and investment, respectively. This is to be compared with the 19 and 25 cases, respectively, for which there is a significant positive effect when the economic prior ‘aid is effective’ is tested. Moreover, we highlight that for the statistically more reliable first and perhaps second best choices of rank, this difference is even more pronounced. In sum, when searching for significantly negative investment and GDP effects for all empirically reasonable specifications, there is little to point to. Positive significance receives far more support. This is notable given that the data are still weak.

How robust are these results to the ceteris paribus clause? As a general rule, the classification of a country may change if we add other relevant variables to the model. Therefore, as was emphasized when discussing the empirical results, they are strictly valid only within the context of our system consisting of aid and the four macrovariables, which cover a range of empirical specifications in the literature. It seems unlikely that the conclusions would change dramatically by adding new domestic variables. But since the present variables define a closed economy, adding open-economy variables like the real exchange rate may change the results. To investigate whether this is the case is the purpose of future work. Similarly, we recognize the need to come to grips better with the role of non-linear cointegration. Given our small sample sizes we could not estimate any threshold for aid's long-run impact on macrovariables, but we recognize the need for more work on this topic.

Overall, our country-based study leads to the following additional more specific conclusions:

  • (i) The importance of adequately accounting for non-stationarity and cointegration is critical. Trend-stationarity of aid and the macrovariables was rejected for all SSA countries. Cointegration is highly significant, and our sensitivity analyses and robustness checks demonstrate that the choice of cointegration rank can be qualitatively crucial for the conclusions reached based on the tests applied to the SSA countries. The use of single equation modelling, which was particularly common in the early aid-growth literature, is on this basis very circumscribed. It requires that the cointegration rank must be one and that aid is exogenous. We found the cointegration rank to be either 2 or 3 (out of a maximum of 5) in essentially all SSA countries and aid to be exogenous in only seven countries. Since exogeneity testing is optimally done within a system of equations, any continued preference for the more restrictive single equation approach is hard to justify.
  • (ii) The common practice of imposing (untested) parameter restrictions implied by various data transformations can be problematic. When tested, these restrictions were generally rejected and they often matter for the conclusions drawn.
  • (iii) It is critical to account for changes of government, changes in the conditionality imposed by the IMF, major adjustment reforms as well as natural catastrophes, such as droughts and floods. Without including these events in the modelling, inference would have been totally unreliable in many cases. The fact that such extraordinary events are generally not controlled for in the literature suggests great care should be exercised before policy recommendations are drawn up.
  • (iv) While the overall qualitative conclusions with respect to aid effectiveness were rather similar for the vast majority of countries, SSA countries have been quite heterogeneous with respect to the transmission of aid on macrovariables. For example, we found that the exogenous shocks that have pushed the system out of equilibrium and the cointegration relations that have pulled it back again frequently differed as to their number and origin across the countries. Considering that aid is often given for different purposes in different countries, this should come as no big surprise. As panel data analyses are implicitly or explicitly based on an assumption of homogeneous countries across the panel, we reiterate that panel data studies should not be used as a basis for drawing up relevant policy advice in individual countries.

Whether aid has worked or not for development has over the years been associated with many perceived paradoxes and dilemmas. One example is the micro–macro paradox due to Mosley (1980, 1987), which suggests that aid is ineffective at the macro level. Our study reinforces the emerging professional consensus that there is indeed no paradox in practice. The economics profession may instead have been excessively preoccupied with econometric paradoxes due to the fact that data and methodological tools have only been gradually improving, in parallel with the much greater care that influential studies should of course be associated with. Our study, in which we started from an explicit stochastic formulation of all variables without constraining them in pre-specified directions, stands for example in marked contrast to Dollar and Easterly (1999). They regressed the investment-to-GDP ratio on the ODA-to-GDP ratio based on essentially the same kind of data.21 They found, as already alluded to, a significantly positive effect of aid on investment in only eight of 34 cases. This may be compared with 25 of 36 countries here. We note that the data transformations in Dollar and Easterly (1999) are based, critically, on an implicit assumption of homogeneity. It was generally rejected when tested. Their bivariate regression model effectively assumes just one relation between the variables (ODA, GDP and investment) and ignores any potential endogeneity between aid and the macrovariables. Both assumptions were found here to be inconsistent with the information in the SSA data. Also, inference on their key parameter is conducted under the assumption that investment-to-GDP and the ODA-to-GDP ratios are stationary. When tested, stationarity was empirically rejected for most countries. The fact that Dollar and Easterly (1999) used non-logged data is likely to have increased the non-stationarity of the ratios. We found that there is substantial support for a log-specification as the statistically preferable option.

In sum, the aim of this paper was to learn more about how aid impacts on macroeconomic variables in SSA. We have found what we see as surprisingly strong evidence in favour of the thesis that aid works. Nevertheless, we stress in conclusion that the evidence is not perfect. There are some cases where aid does not seem to have worked given the nature of the evidence in hand at present. We suggest that they merit careful deeper analysis. Moreover, we were able to include only four macrovariables to represent the macroeconomy. This means that further work is needed to capture more convincingly the deeper country context Riddell (2007) refers to. In other words, further disaggregation would clearly be desirable to tease out more detailed stories, as Papanek (1972) already pointed out.


  • 1

    See Hansen and Tarp (2001) for a critique of Burnside and Dollar (2000), Arndt et al. (2010) for a re-examination of Rajan and Subramanian (2008) and Mekasha and Tarp (2011) for a rebuttal of Doucouliagos and Paldam (2008), on exactly these grounds.

  • 2

    The ODA data with complete documentation are available from the homepage of the OECD. Net disbursements are defined as the sum of grants, capital subscriptions and net loans. As net loans are net of repayments, ODA net disbursements can be negative.

  • 3

    Note that WDI covers less than half of the countries studied here. The WDI database is available at: http://data.worldbank.org/data-catalog/world-development-indicators.

  • 4

    See for example, Dollar and Easterly (1999, pp. 548–549), and Easterly (1999).

  • 5

    See for example, Burnside and Dollar (2000).

  • 6

    For a review of the literature on fungibility and fiscal response see for example, McGillivray and Morrissey (2004).

  • 7

    Test procedures that formally check for these problems generally support logs rather than levels (e.g. Kobayashi and McAleer, 1999; Ermini and Hendry, 2008; Spanos et al., 2008).

  • 8

    Computations based on the non-log model can be obtained from the authors.

  • 9

    An anonymous referee has aptly made the point that there may well be situations where aid needs to hit a threshold before having a long-run impact on macro-variables. We cannot estimate such a threshold with the available sample size so we decided to focus on ‘one regime’.

  • 10

    The φ coefficients can then be computed as inline image and inline image.

  • 11

    The latter is in turn supposed to influence GDP per capita.

  • 12

    A variable y_t is ‘Granger caused’ by aid_t if the change Δy_t can be explained by ‘lagged changes in aid’, Δaid_{t−i}, but not the other way around.

  • 13

    This is also called weakly exogenous in the econometrics literature.

  • 14

    This was broadly supported in all empirical models.

  • 15

    All estimation results have been obtained by the software packages CATS in RATS (Dennis, Hansen and Juselius, 2006) and OxMetrics (Doornik and Hendry, 2001).

  • 16

    The documentation for this (including programme code for CATS for each country) can be obtained from the authors upon request.

  • 17

    A referee aptly remarked that this anomalous finding for Tanzania does accord with the economic goals of the country over the sample period, where explicit equity objectives were pursued in contrast to growth objectives.

  • 18

    A referee pointedly noted that for Cameroon the difference between the first choice rank (which strongly rejects the null) and the second choice rank (which strongly cannot reject the null) is large relative to the other country differences. We agree and note that Cameroon is a very complex economy due to its structure, its dependency on oil, a particular political history and substantial variation in policy regimes. Against this background our results are not so surprising and are also methodologically perfectly possible given our data. We fully agree with the referee that it would be relevant to dig deeper into this case and carry out a detailed country case study to be able to put forward a comprehensive and well-grounded assessment.

  • 19

    That aid has a positive effect on investment in most cases is consistent with the findings of, for example, Gomanee et al. (2005). In contrast, Boone (1996) and Dollar and Easterly (1999) generally find no or little evidence of a positive investment effect (see below).

  • 20

    For investment, such an outcome may result from public investment crowding out private investment fully or more than that, respectively.

  • 21

    Their sample, 1965–95, is however shorter.


Appendix A: List of excluded SSA-Countries

Table A1. Countries not included in the study
Country | Miss. aid obs. | Reasons for non-inclusion
Angola | 2 | Data for the macro variable start in 1970 and aid reaches reasonably high levels only from 1977
Cape Verde | 11 | Missing aid data
Congo (Dem. Rep. of) | 0 | Poor data quality for the macrovariables
Côte d'Ivoire | 0 | A fundamental structural break around 1980
Equatorial Guinea | 13 | Missing aid data
Eritrea | 33 | Missing aid data
Guinea-Bissau | 8 | Missing aid data
Mayotte | 17 | Missing aid data
Mozambique | 1 | The aid data were roughly zero until 1975
Namibia | 23 | Missing aid data
Sao Tome and Principe | 10 | Missing aid data
Sierra Leone | 0 | Negative investment data (see the note below).
South Africa | 33 | Missing aid data

  1. Note: See the homepage of Penn World Tables (http://pwt.econ.upenn.edu/php_site/pwt_index.php) under ‘Old Documentation’.

Appendix B: Comparing log versus non-log aid specifications

Table B1. Comparing residual-based misspecification tests when using log-aid versus non-log-aid
  Auto corr. Norm. Hetero. R2
  1. Notes: ‘log’ versus ‘non-log’ indicates which of the aid specification is preferred, no entry means equally adequate specifications.

  2. Source: Authors’ comparisons.

Burkina Faso  Log  Log
Ctrl. Afr. Rep.
Rep. of Congo  Non-log
Gabon  Only non-log possible for the full sample
Mauritius  Only non-log possible for the full sample

Appendix C: The t-ratios of the elements of C21 for different choices of rank

Table C1. The t-ratios of the elements in C21 for the first best choice of rank with second and third best choice in brackets

 | Benin | Botswana | Burkina Faso | Burundi
y_t | 1.09 (−2.40,1.00) | −1.76 (1.78,−2.52) | 0.73 (−1.86,−0.56) | 0.11 (−3.07,2.79)
inv_t | 3.02 (−2.40,−8.49) | 3.24 (0.62,1.88) | −0.55 (−1.39,0.51) | 3.81 (3.07,3.98)
c_t | 3.74 (2.40,−0.15) | −4.73 (1.37,−5.39) | 0.92 (−1.51,0.45) | 0.52 (−3.07,0.64)
g_t | −2.87 (−2.40,8.41) | 0.87 (1.94,−0.81) | 0.14 (1.00,0.56) | 3.74 (3.07,6.56)

 | Cameroon | Ctrl. Afr. Rep. | Chad | Comoros
y_t | −0.43 (−1.80,−2.02) | 2.14 (2.14,2.06) | −0.44 (−0.78,−0.44) | 1.70 (−1.40,6.29)
inv_t | 0.74 (0.11,0.48) | 2.65 (0.57,2.78) | 0.54 (0.44,0.44) | −3.27 (−0.59,−6.29)
c_t | −0.03 (−1.60,−1.85) | 2.71 (3.15,1.20) | −0.74 (−0.92,−0.44) | 6.60 (10.30,6.29)
g_t | −0.32 (−3.93,−3.06) | 4.87 (7.11,11.37) | −0.13 (−0.49,−0.44) | −0.68 (−2.38,6.29)

 | Rep. of Congo | Djibouti | Ethiopia | Gabon
y_t | −0.47 (0.99,−1.65) | 1.45 (−1.34,−1.29) | −2.11 (−0.52,0.52) | 0.25 (1.02,−1.32)
inv_t | 1.81 (1.85,2.24) | 1.45 (2.14,0.35) | 1.42 (3.18,0.52) | 0.01 (1.02,−1.15)
c_t | −0.56 (−0.12,−0.92) | 1.45 (0.52,0.56) | −1.70 (−2.21,0.52) | −2.78 (1.02,−3.08)
g_t | 0.67 (0.90,−2.33) | 1.45 (2.14,3.50) | −3.38 (2.38,0.52) | 1.22 (1.02,−0.64)

 | The Gambia | Ghana | Guinea | Kenya
y_t | 4.57 (1.89,3.56) | 2.71 (−0.21,3.35) | 1.27 (−4.57,−2.75) | 3.87 (1.58,0.71)
inv_t | 2.91 (1.89,0.16) | −4.11 (−0.21,−3.18) | 4.38 (4.57,5.01) | 3.20 (2.93,3.17)
c_t | 4.21 (1.89,3.90) | −0.05 (−0.21,−0.22) | 0.35 (−4.57,−1.81) | 3.67 (2.05,−0.68)
g_t | 3.49 (1.89,2.01) | −0.77 (−0.21,−2.24) | 4.54 (4.57,6.05) | 0.62 (1.07,1.04)

y_t | 2.37 (2.79,−0.69) | −3.38 (5.02,−1.49) | 2.19 (1.59,2.48) | 2.82 (−4.36,2.35)
inv_t | 2.21 (2.65,0.69) | −0.48 (3.44,−0.56) | 0.25 (−0.02,1.44) | 0.97 (1.98,1.49)
c_t | 3.30 (3.39,−0.69) | −8.06 (3.24,−1.49) | 1.50 (0.45,−0.77) | 2.59 (−1.93,2.30)
g_t | 2.27 (1.60,−0.69) | −1.67 (6.52,−0.24) | −0.67 (−1.23,−2.05) | 0.62 (1.13,1.33)

y_t | 3.03 (0.21,−0.96) | −0.96 (0.63,2.42) | 0.04 (0.71,0.61) | 0.94 (2.57,1.31)
inv_t | 6.54 (7.09,6.44) | 3.17 (−2.90,2.42) | 3.92 (4.08,2.12) | 1.06 (5.12,1.31)
c_t | −4.95 (−5.38,0.13) | 0.66 (4.22,2.42) | −1.39 (−0.75,−1.32) | −0.98 (3.65,−1.31)
g_t | 1.17 (0.83,3.84) | 2.93 (1.34,2.42) | −4.34 (−1.35,−1.94) | 1.40 (2.83,1.31)

y_t | 1.95 (3.82,0.00) | 0.98 (3.38,6.24) | 2.02 (4.04,1.60) | −1.06 (−0.08,−0.99)
inv_t | 1.45 (−1.60,−0.00) | 2.16 (3.38,3.28) | 0.42 (2.63,−1.60) | 0.71 (3.71,0.87)
c_t | 2.11 (1.47,0.00) | 3.27 (3.38,1.01) | 2.07 (2.25,1.60) | 6.10 (2.06,0.83)
g_t | 1.51 (−6.06,0.00) | −1.81 (−3.38,4.72) | 2.20 (3.90,1.60) | 0.18 (−0.79,0.92)

y_t | 0.58 (−0.22,1.32) | 2.62 (1.98,2.56) | −0.09 (0.84,0.35) | −1.17 (−1.43,−1.30)
inv_t | 2.23 (0.06,−5.51) | 3.59 (2.69,2.56) | 1.60 (−2.44,−0.35) | 0.36 (3.25,−1.42)
c_t | 0.99 (0.07,4.31) | 4.07 (1.64,2.56) | −0.36 (−2.27,0.35) | −0.03 (−1.28,−1.17)
g_t | −0.90 (−1.21,2.28) | 1.48 (0.96,−2.56) | −1.67 (−1.24,0.35) | −0.87 (−0.10,0.33)

y_t | 3.69 (2.03,0.43) | 1.21 (2.71,−2.58) | −0.13 (−0.14,0.23) | 0.17 (1.90,2.23)
inv_t | −0.95 (1.81,−6.51) | 1.23 (2.71,1.51) | −1.70 (−1.96,0.23) | −1.71 (2.33,2.23)
c_t | 4.95 (−1.63,0.39) | 0.44 (2.71,−3.63) | 1.41 (1.01,0.23) | 0.98 (1.31,2.23)
g_t | 2.19 (3.76,1.87) | 1.48 (2.71,−0.91) | 1.61 (0.55,0.23) | 1.83 (2.32,2.23)

  1. Note: In almost all cases the third best choice of r is within the range r*±1.

  2. Source: Authors’ calculations.