Luisa Corrado acknowledges the support of the Marie-Curie IE Fellowship 039326 and the Marie Curie Excellence Award 2007.

# WHERE IS THE ECONOMICS IN SPATIAL ECONOMETRICS?*

Article first published online: 11 APR 2011

DOI: 10.1111/j.1467-9787.2011.00726.x

© 2011, Wiley Periodicals, Inc.

#### How to Cite

Corrado, L. and Fingleton, B. (2012), WHERE IS THE ECONOMICS IN SPATIAL ECONOMETRICS?. Journal of Regional Science, 52: 210–239. doi: 10.1111/j.1467-9787.2011.00726.x

#### Publication History

- Issue published online: 19 APR 2012
- Article first published online: 11 APR 2011
- Received: July 2010; revised: November 2010; accepted: November 2010.

### Abstract

- Top of page
- Abstract
- 1. INTRODUCTION
- 2. W WITHIN SPATIAL ECONOMETRIC SPECIFICATIONS
- 3. PUTTING SOME ECONOMICS INTO W
- 4. DO WE NEED W?
- 5. CHOOSING W EMPIRICALLY
- 6. ALTERNATIVES TO W
- 7. HIERARCHICAL MODELS
- 8. MODELING SPATIAL EFFECTS THROUGH THE ERROR TERM
- 9. LINKING SPATIAL MODELS TO HIERARCHICAL MODELS WITH INTERACTION EFFECTS
- 10. CONCLUSIONS
- REFERENCES

**ABSTRACT** Spatial econometrics has been criticized by some economists because some model specifications have been driven by data-analytic considerations rather than having a firm foundation in economic theory. In particular, this applies to the so-called **W** matrix, which is integral to the structure of endogenous and exogenous spatial lags, and to spatial error processes, and which are almost the *sine qua non* of spatial econometrics. Moreover, it has been suggested that the significance of a spatially lagged dependent variable involving **W** may be misleading, since it may be simply picking up the effects of omitted spatially dependent variables, incorrectly suggesting the existence of a spillover mechanism. In this paper, we review the theoretical and empirical rationale for network dependence and spatial externalities as embodied in spatially lagged variables, arguing that failing to acknowledge their presence at least leads to biased inference, can be a cause of inconsistent estimation, and leads to an incorrect understanding of true causal processes.

### 1. INTRODUCTION

The critique of spatial econometrics emanating from some economists is, we assert at the outset, based on imprecise and ill-informed perceptions of the sophistication and diversity of the work of the spatial econometrics and wider academic community. The argument is that standard spatial econometrics is typically applied in a mechanical fashion, variables are introduced simply because they are significant, without *a priori* rationale, spatial econometricians often work in isolation from urban economists and other regional scientists, and overall there is a lack of theoretical justification for variables that characterize spatial econometric models.^{1} An engagement with the literature shows this to be a misrepresentation. There are numerous examples of single equation cross-sectional spatial econometric models, multiequation specifications, and panel models incorporating spatial effects, in which economic theory is fundamental to the specification of the reduced form, including specifications based on neoclassical growth theory (Fingleton and Lopez-Bazo, 2006; Ertur and Koch, 2007), urban economics (Fingleton, 2003; Barde, 2010), and on the wage equation from new economic geography (Fingleton, 2006). Almost invariably, these specifications are elaborations of mainstream theory incorporating externalities in the form of spatial spillovers, being characterized by the presence of the trademark component of the spatial econometric model, namely the spatial lag, which can be considered to be the *sine qua non* of spatial econometrics. While the theory underlying these models is often exceptionally well established and well received, nevertheless it is also true that there are cases in which spatial econometric work has been too casual in its attempt to base model specifications on economic theory. Our first main contribution is to highlight this criticism. 
Often there is no attempt to make theory testing a focal point of the research, and too often we see an emphasis on diagnostics and empirical model validity as the most important criteria to be used to signify a good model, without any attempt to relate to real or theorized economic processes and mechanisms. Most significantly, when it comes to the spatial lag, which is based on the so-called **W** matrix of spatial weights, many economists are skeptical, puzzled, or both. The values in the cells of **W** comprise an explicit hypothesis about the strength of connection between locations (typically towns, regions, or countries), and in many cases the matrix product of **W** and the endogenous variable **Y**, namely the endogenous spatial lag **WY**, proves to be a highly significant variable. It has been suggested by the skeptics that the significance of the explanatory variable **WY** may be misleading, since it may be simply picking up the effects of omitted spatially dependent variables, **WX**, incorrectly suggesting the existence of a spillover mechanism.

One way out of the **W** matrix conundrum may appear to be via hierarchical modeling (HLM), which has had only very limited exposure in the spatial economics literature (exceptions being Smith and LeSage, 2004; Parent and LeSage, 2008). Hierarchical models (also known as multilevel models) are becoming increasingly popular across the range of the social sciences, as researchers come to appreciate that observed outcomes depend on variables organized in a nested hierarchy.^{2} In regional science and spatial economics, we can often envisage a hierarchy of effects from cities, regions containing cities, and countries containing regions. Failure to recognize these effects emanating from different hierarchical levels can lead to incorrect inference. The second major contribution of this paper is to point this out and to show that something equivalent to **W** is very much a cognate part of a hierarchical approach, with an enhancement to HLM coming by way of incorporating spatial effects via **W**. We demonstrate that the inclusion of interdependence between groups in the form of spatial effects, **WX,** has two main advantages: (i) it avoids the omitted variable problem that may afflict models with endogenous spatial lags and (ii) it introduces a source of exogenous variation that allows the identification of both endogenous and exogenous group effects.

### 2. W WITHIN SPATIAL ECONOMETRIC SPECIFICATIONS

The square matrix **W** is of dimension *N*, where *N* is the number of nodes in a network, with the value in typical cell *W*_{jh} quantifying the hypothesized strength of interaction between nodes *j* and *h*. Here we stress that nodes need not necessarily be places, in order to draw on the wider literature that provides additional support to the concept of network interaction. Typically all of the diagonal elements of **W** are zero, and **I**−ρ**W** is nonsingular. Also, following Kapoor, Kelejian, and Prucha (2007) and Kelejian and Prucha (1998), **W** should be uniformly bounded in absolute value, meaning that a constant *c* exists such that $\max_{j} \sum_{h=1}^{N} |W_{jh}| \le c < \infty$ and $\max_{h} \sum_{j=1}^{N} |W_{jh}| \le c < \infty$, so as to produce asymptotic results required for consistent estimation.
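The properties just listed can be checked on a small example. The following is a minimal sketch, assuming a hypothetical 4 × 4 lattice with rook contiguity and row standardization; the helper names are illustrative and not taken from the paper.

```python
def rook_contiguity(rows, cols):
    """Binary W* for a rows x cols lattice: 1 if two cells share an edge."""
    n = rows * cols
    w_star = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            j = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w_star[j][rr * cols + cc] = 1.0
    return w_star


def row_standardize(w_star):
    """Divide each row by its row total so that every row sums to 1."""
    return [[v / sum(row) for v in row] for row in w_star]


def max_abs_row_sum(w):
    return max(sum(abs(v) for v in row) for row in w)


def max_abs_col_sum(w):
    return max(sum(abs(row[h]) for row in w) for h in range(len(w)))


W = row_standardize(rook_contiguity(4, 4))
zero_diagonal = all(W[j][j] == 0.0 for j in range(len(W)))
# Row sums equal 1 by construction; column sums differ but stay bounded.
print(zero_diagonal, max_abs_row_sum(W), max_abs_col_sum(W))
```

Row standardization fixes every row sum at 1, while the column sums depend on how many neighbours each neighbour has; on a regular lattice they remain bounded as the lattice grows, which is the kind of behaviour the boundedness condition requires.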

In a single-equation context, multiplying **W** by a *N*× 1 vector (dependent variable) **Y**, gives the endogenous spatial lag **WY**, which is an integral component of numerous spatial econometric models. However given the *N*×*k* matrix of variables **X**, additional spatially lagged variables can be introduced, forming the columns of the matrix **WX**. Also we can add a hypothesis that the errors may be spatially dependent. Following Anselin, Le Gallo, and Jayet (2007), who write from a spatial panel data perspective, there are four ways we might wish to model spatial effects operating through the error term, namely (i) direct representation, which originates from the geostatistical literature (Cressie, 2003); as noted by Anselin (2003), this requires exact specification of a smooth decay with distance and a parameter space commensurate with a positive definite error variance–covariance matrix. Alternatively, as in Conley (1999), a looser definition of the distance decay may be implemented, leading to nonparametric estimation; (ii) spatial error processes typified by much work in spatial econometrics (Anselin, 1988), based on a matrix, say **M**, which is *N*×*N* with similar properties to **W** and which may or may not be the same as **W,** defining indirectly the spatial structure of the nonzero elements of the error variance–covariance matrix. Like **W**, the **M** matrix comprises non-negative values representing the *a priori* assumption about interaction strength between location pairs defined by specific rows and columns of **M**, normally with zeros on the main diagonal. (iii) Common factor models originating from the time series literature (Kapetanios and Pesaran, 2005; Pesaran, 2007; Hsiao and Pesaran, 2008) and (iv) spatial error components models (Kelejian and Robinson, 1995; Anselin and Moreno, 2003), which we consider subsequently.

Allowing endogenous and exogenous spatial lags, plus a spatial error process, and assuming nodes are linked by network dependence matrix **W** for the lags and by **M** for the error process, and assuming autoregressive processes, the typical single equation spatial econometric model specification, the spatially autoregressive model with autoregressive disturbances (SARAR) model, is

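$$\mathbf{Y} = \rho\mathbf{W}\mathbf{Y} + \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\mathbf{X}\boldsymbol{\beta}_{x} + \mathbf{e} \tag{1}$$

$$\mathbf{e} = \lambda\mathbf{M}\mathbf{e} + \boldsymbol{\xi} \tag{2}$$

$$\boldsymbol{\xi} \sim iid(\mathbf{0}, \sigma^{2}\mathbf{I}) \tag{3}$$

In (1), ρ is a scalar coefficient, **β** and **β**_{x} are *k*× 1 vectors of coefficients, and **e** is an *N*× 1 vector of disturbances. For the error process, we have the scalar λ and the *N*× 1 vector of innovations **ξ** drawn from an *iid* distribution with variance σ^{2}.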
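To make the structure of the SARAR specification concrete, here is a minimal simulation sketch. It assumes the model takes the form Y = ρWY + Xβ + WXβ_x + e with e = λMe + ξ, a single regressor, and a hypothetical ring network in which each node has two neighbours; all parameter values are illustrative. The linear systems are solved by a simple fixed-point iteration rather than an explicit matrix inverse.

```python
import random


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def solve_spatial(w, coef, c, iters=200):
    """Solve (I - coef*W) y = c by iterating y <- c + coef*W y.

    Converges when |coef| < 1 and the rows of W sum to 1.
    """
    y = list(c)
    for _ in range(iters):
        y = [ci + coef * v for ci, v in zip(c, matvec(w, y))]
    return y


def ring_weights(n):
    """Row-standardized W for n nodes on a ring (two neighbours each)."""
    w = [[0.0] * n for _ in range(n)]
    for j in range(n):
        w[j][(j - 1) % n] = 0.5
        w[j][(j + 1) % n] = 0.5
    return w


random.seed(0)
N = 30
rho, lam, beta, beta_x, sigma = 0.4, 0.3, 2.0, 0.5, 1.0   # illustrative values
W = ring_weights(N)
M = W                                    # M may or may not be the same as W

x = [random.gauss(0.0, 1.0) for _ in range(N)]
xi = [random.gauss(0.0, sigma) for _ in range(N)]          # iid innovations
e = solve_spatial(M, lam, xi)                              # e = lam*M e + xi
wx = matvec(W, x)                                          # exogenous lag WX
rhs = [beta * a + beta_x * b + c for a, b, c in zip(x, wx, e)]
y = solve_spatial(W, rho, rhs)                             # Y solved from the model

# Check that y satisfies Y = rho*W Y + X beta + W X beta_x + e up to rounding.
wy = matvec(W, y)
max_resid = max(abs(yi - rho * v - r) for yi, v, r in zip(y, wy, rhs))
print(max_resid)
```

The iteration mirrors the series expansion of (I − ρW)^{-1}, so each pass adds one further round of neighbour-to-neighbour feedback.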

We extend the scope of the spatial econometric models in two ways, first by introducing the time dimension, thus allowing panel data models with network dependence, and secondly by considering multilevel models (see Corrado and Fingleton, 2010).

Consider a simple random effects panel specification for time *t*= 1 , … , *T* and for individual *i*= 1 , … , *N* given by

- (4)

with ν_{it}∼*iid*(0, σ^{2}_{ν}) and μ_{i}∼*iid*(0, σ^{2}_{μ}), which can be rewritten as

- (5)

where *Y*_{it} is individual *i*'s response at time *t*, *X*_{it} is the exogenous variable, μ_{i} is an error specific to each individual, and ν_{it} is a transient error component specific to each time and each individual. We can introduce spatial effects both as an endogenous spatial lag

- (6)

and as an autoregressive error process

- (7)

and generalizing to *k* regressors in the panel context this becomes

$$\mathbf{Y} = \rho(\mathbf{I}_{T} \otimes \mathbf{W})\mathbf{Y} + \mathbf{X}\boldsymbol{\beta} + \mathbf{e} \tag{8}$$

in which **Y** is a *TN*× 1 vector of observations obtained by stacking *Y*_{it} for *i*= 1 … *N* and *t*= 1 … *T*, **X** is a *TN*×*k* matrix of regressors, and **β** is a *k*× 1 vector of coefficients.

In addition, given a *TN*×*TN* identity matrix **I**_{TN}, the *NT*× 1 vector **e** is

$$\mathbf{e} = (\mathbf{I}_{TN} - \lambda\,\mathbf{I}_{T} \otimes \mathbf{M})^{-1}\boldsymbol{\xi} \tag{9}$$

in which **ξ** is an *NT*× 1 vector of innovations, λ is a scalar parameter, and **M** is an *N*×*N* matrix with similar properties to **W**. Regarding the error components in space–time, time dependency is introduced into the innovations via the permanent individual error component **μ**, thus,

- (10)

- (11)

- (12)

so that **μ** is an *N*× 1 vector of random effects specific to each individual, **ν** is the transient error component comprising an *NT*× 1 vector of errors specific to each individual and time, **ι**_{T} is a *T*× 1 vector of 1s, and **ι**_{T}⊗**I**_{N} is a *TN*×*N* matrix equal to *T* stacked **I**_{N} matrices. The result is that the *TN*×*TN* innovations variance–covariance matrix **Ω**_{ξ} is nonspherical. Also σ^{2}_{1}=σ^{2}_{ν}+*T*σ^{2}_{μ}. Note that this differs from the specifications given by Anselin (1988) and Baltagi and Li (2006), where the autoregressive error process is confined to **ν**. In contrast, the Kapoor et al. (2007) set-up assumes that the individual effects have the same autoregressive process.
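The nonspherical structure of the innovations can be illustrated numerically. The sketch below assumes the standard error-components form in which each innovation is the sum of a permanent individual effect (variance σ²_μ) and a transient error (variance σ²_ν), with observations stacked so that time is the slow index; the function name and the tiny dimensions are hypothetical.

```python
def innovations_covariance(t_periods, n, var_mu, var_nu):
    """TN x TN covariance of xi_it = mu_i + nu_it, stacked with time as slow index."""
    size = t_periods * n
    omega = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            if a % n == b % n:          # same individual, any pair of periods
                omega[a][b] += var_mu
            if a == b:                  # same individual and same period
                omega[a][b] += var_nu
    return omega


T, N = 3, 2
var_mu, var_nu = 0.5, 1.0               # illustrative variances
omega = innovations_covariance(T, N, var_mu, var_nu)
var_1 = var_nu + T * var_mu             # sigma^2_1 = sigma^2_nu + T * sigma^2_mu

# Any vector that repeats the same N values in every period is an eigenvector
# of omega with eigenvalue sigma^2_1.
v = [1.0, -2.0] * T
omega_v = [sum(x * y for x, y in zip(row, v)) for row in omega]
print(omega[0][N], [round(ov / vi, 12) for ov, vi in zip(omega_v, v)])
```

The nonzero entry `omega[0][N]` links the same individual across two periods, which is why the matrix is not a scalar multiple of the identity.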

### 3. PUTTING SOME ECONOMICS INTO W

The suggestion that spatial econometrics may have been somewhat “mechanical” in its application is undoubtedly true in instances where little care has been taken regarding the theoretical basis of the model specification. In this section, we seek to show that in many applications of spatial econometrics, considerable attention has been given to specifying the matrices **W** and **M** in a rational manner that attempts to represent as far as possible real social and economic processes. Hence, we argue that in many cases the presence of a spatial lag, say **WY**, is necessary because it does reflect a true interaction and is not simply a surrogate for some omitted variables. In fact, it appears that an absence of detailed consideration for the structure of **W**, or indeed rejection of an approach based on **W**, has come from analysts who are not particularly interested in the spatial processes *per se*, but see spatial dependence as something of a nuisance that, yes, has to be allowed for in modeling but which is not the focal point of their research. In this case, spatial error dependence can be treated in very general terms, say by spatial heteroscedastic autocorrelation consistent (SHAC) estimation or by common factor approaches.

We first focus on the very basic form of **W** matrix, which has close links with time series analysis. In fact, it is easy to show that an autoregressive process in time has an equivalent to the **W** matrix, as demonstrated in Fingleton (2009).

Consider the **W*** matrix, in which *W**_{jh}= 1 if locations *j* and *h* are close to each other in space, and *W**_{jh}= 0 otherwise. By close, we typically mean contiguous. For time series we have a comparable contiguity matrix, let us call it **H**. To see the near equivalence of **H** and **W***, consider a data-generating process for *T* periods,

$$Y_{t} = \alpha Y_{t-1} + \varepsilon_{t}, \qquad |\alpha| < 1, \quad t = 1, \ldots, T \tag{13}$$

This generates a stationary time series. In matrix terms, an entirely equivalent data-generating process is given by

$$\mathbf{Y} = \alpha\mathbf{H}\mathbf{Y} + \boldsymbol{\varepsilon} \tag{14}$$

in which **Y** is a *T*× 1 vector, α is a scalar parameter, **ɛ** is a *T*× 1 vector of disturbances, and **H** is a *T*×*T* matrix with row *t* designating the same time point as column *t* and 1s indicating time proximity. In spatial series, **W*** is *N*×*N*, where *N* is the number of places or nodes on the network. Below we show a typical **W*** matrix for *N*= 10, and its time series counterpart **H**

Figures 1–3 show small hypothetical networks that can be represented as simple binary-valued **W*** matrices.
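The equivalence between the autoregressive recursion in time and its matrix form using **H** can be verified directly. This is a minimal sketch under two assumptions that are not stated in the text: the series starts from a zero initial value, and **H** carries its 1s on the first subdiagonal so that row *t* picks out period *t* − 1.

```python
import random


def lag_matrix(t_periods):
    """T x T matrix H with 1s on the first subdiagonal."""
    h = [[0.0] * t_periods for _ in range(t_periods)]
    for t in range(1, t_periods):
        h[t][t - 1] = 1.0
    return h


def ar1_recursive(alpha, eps):
    """Y_t = alpha * Y_{t-1} + eps_t, starting from an assumed Y_0 = 0."""
    y, prev = [], 0.0
    for e in eps:
        prev = alpha * prev + e
        y.append(prev)
    return y


def ar1_matrix(alpha, eps):
    """Solve (I - alpha*H) Y = eps by forward substitution."""
    h = lag_matrix(len(eps))
    y = [0.0] * len(eps)
    for t in range(len(eps)):
        y[t] = eps[t] + alpha * sum(h[t][s] * y[s] for s in range(t))
    return y


random.seed(42)
eps = [random.gauss(0.0, 1.0) for _ in range(8)]
y_rec = ar1_recursive(0.6, eps)
y_mat = ar1_matrix(0.6, eps)
max_gap = max(abs(a - b) for a, b in zip(y_rec, y_mat))
print(max_gap)
```

Because **H** is strictly lower triangular, dependence runs in one direction only; a spatial **W*** generally has nonzero entries on both sides of the diagonal, which is the multilateral feature that distinguishes spatial from temporal dependence.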

Typically in practice one would scale the matrix **W***, say by the maximum eigenvalue, since this clearly indicates ‘allowable’ values that avoid singularities. One way to do this is by

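$$\mathbf{W} = \frac{\mathbf{W}^{*}}{\max(eig)} \tag{15}$$

in which max (*eig*) denotes the maximum eigenvalue of **W***, where **W*** is such that real eigenvalues can be obtained. Using this normalization, the maximum eigenvalue of **W** is 1, and the continuous range for ρ in which (**I**−ρ**W**) is nonsingular is $1/\min(eig_{\mathbf{W}}) < \rho < 1$, where $\min(eig_{\mathbf{W}})$ is the most negative eigenvalue of **W**. An alternative to this, due to Ord (1975), is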

$$\mathbf{W} = \mathbf{D}^{-1/2}\,\mathbf{W}^{*}\,\mathbf{D}^{-1/2} \tag{16}$$

in which the diagonal matrix **D** takes values equal to the row sums of **W***. Most applications in spatial econometrics scale the individual rows (or columns) of **W*** by the row totals, so that the **W** rows sum to 1. In both these alternatives, the same conditions apply for nonsingularity. With **W** thus defined, the spatial data generating process is

$$\mathbf{Y} = \rho\mathbf{W}\mathbf{Y} + \boldsymbol{\varepsilon} \tag{17}$$

which is identical to the time series data generating process, except that **W** replaces **H** and ρ replaces α. As ρ → 1, we approach near unit root spatial autoregressive (SAR) processes, with similar consequences as in nonstationary time series.
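Two of the normalizations discussed above, scaling by the maximum eigenvalue and row standardization, can be compared on a toy network. This is a sketch only: the four-node line network is hypothetical, and `max_eigenvalue` is an illustrative helper that uses shifted power iteration (a swapped-in numerical technique, not a method described in the paper).

```python
def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def max_eigenvalue(w, iters=2000):
    """Largest eigenvalue of a non-negative matrix for a connected network.

    Power iteration is applied to W + I so that it also converges for
    bipartite networks, whose spectrum is symmetric about zero.
    """
    v = [1.0] * len(w)
    value = 1.0
    for _ in range(iters):
        shifted = [a + b for a, b in zip(matvec(w, v), v)]   # (W + I) v
        value = max(abs(s) for s in shifted)
        v = [s / value for s in shifted]
    return value - 1.0


# Hypothetical binary W* for four nodes arranged on a line.
W_star = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]

lam_max = max_eigenvalue(W_star)
W_eig = [[v / lam_max for v in row] for row in W_star]      # eigenvalue scaling
W_row = [[v / sum(row) for v in row] for row in W_star]     # row standardization

print(lam_max, max_eigenvalue(W_eig), max_eigenvalue(W_row))
```

Both normalized matrices have a maximum eigenvalue of 1, so ρ = 1 marks the upper boundary of the admissible range in either case; eigenvalue scaling preserves the symmetry of **W***, whereas row standardization does not.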

While we have given some emphasis to the similarity of the DGP for autoregressive time series and spatial series, there are major dissimilarities on account of the multilateral nature of spatial interaction. In contrast, with time there are only two directions, forward and backwards. One of the contributions of Cliff and Ord (1973, 1981) was to extend the definition of the matrix **W** to accommodate the distinctiveness of spatial processes. Spillovers have diverse origins, and therefore one would anticipate that the way to model them takes on various forms. For instance, they may be the outcome of network economics (Goyal, 2009), commuting,^{3} migration, displaced demand and supply effects in the housing market, input–output linkages, competition and coordination between firms, localized information flows through social networks, strategic interaction between policy makers,^{4} tax competition between local authorities, or even simply arbitrary boundaries causing spatial autocorrelation. We cannot deal with all of these cases, so in the following paragraphs we focus on a selection, commencing with the traditional distance-based unidimensional measures adopted in spatial econometrics (Anselin, 1988) and then introducing other multidimensional measures based on various notions of social or economic distance. Typically, isotropy is assumed, so that only the distance between *j* and *h* is relevant, not the direction *j* to *h*. These may provide the basis for direct or indirect estimation of the error variance–covariance matrix, including the spillover in error components models.

Moving beyond crude measures of between-group spatial “distance,” such as the simple notions of proximity and contiguity, leads us to slightly more elaborate specifications, which nonetheless are still based on the physical features of geographical units. For example, Cliff and Ord (1973) combine distance and length of the common border between contiguous spatial units thus,

$$W^{*}_{jh} = d_{jh}^{-a}\,\chi_{jh}^{b} \tag{18}$$

where *d*_{jh} denotes the distance between locations *j* and *h*, χ_{jh} is the proportion of the boundary of *j* shared with *h*, and *a* and *b* are parameters. More general distance measures include multidimensional indicator functions. For example, Bodson and Peters (1975) use a general accessibility weight (calibrated between 0 and 1), which combines in a logistic function several channels of communication between regions such as railways, motorways etc.:

- (19)

where *p*_{j} indicates the relative importance of the means of communication *j*. The sum is over the *J* means of communication with *d*_{jh} equal to the distance from *j* to *h*; *a*, *b*, and *c*_{j} are parameters that need to be estimated.

The above measures are less useful when the spatial interaction is determined by purely economic variables, which may have little to do with spatial configuration of boundaries or geographical distance *per se*. This introduces the notion of economic distance, and developments in the conceptualization of economic distance have been surveyed in Greenhut, Norman, and Hung (1987). According to Fingleton and Le Gallo (2008) “the spillover between areas will not simply be a function of spatial propinquity, to the exclusion of other effects” and “it is more realistic to base it on relative ‘economic distance.’ Big towns and cities are less remote than their geographical separation would imply, whereas very small locations are often isolated from one another.” Hence, economic distance reflects the reduced transaction costs associated with flows between geographically remote cities, which have better communications infrastructure, lower costs of information gathering and uncertainty, and similar economic and employment structures. Economic distance features in the work of Conley (1999), Pinkse, Slade, and Brett (2002), Conley and Topa (2002), Conley and Ligon (2002), and Slade (2005). For example, Conley and Ligon (2002) estimate the costs of moving factors of production. Physical capital transport costs are related to intercountry package delivery rates, and the cost of transporting embodied human capital is based on airfares between capital cities (the correlations with great circle distances are not perfect). For practical reasons, they confine their analysis to single distance metrics, although they prefer multiple distance measures. Taking the wider perspective of the industrial organization literature, distances may be in terms of trade openness space, regulatory space, commercial space, industrial structure space, or product characteristics space.

Formulation of a **W** matrix to reflect relative economic distance has been considered by, among others, Fingleton (2001, 2008), LeSage and Pace (2008), and Fingleton and Le Gallo (2008). For example, consider the unstandardized matrix

$$W^{*}_{jh} = Q_{h,0}^{\alpha}\, d_{j,h}^{-\gamma} \tag{20}$$

in which *Q*_{h,0} is the level of output in economy *h* (at time 0) and *d*_{j,h} is a measure of geographical separation of locations *j* and *h*. There is no need to consider *Q*_{j,0} because it is nullified by the process of standardizing *W** by dividing by row totals to obtain **W**. In the context to which this applies, the use of start of period values for *Q* excludes feedback from other model variables, thus ensuring the exogeneity of **W**. The coefficients α and γ reflect the weight attributed to *Q*_{h,0} and *d*_{j,h}, with α= 0 corresponding to a pure distance effect, and γ= 0 corresponding to a pure economic size effect. These could be estimated alongside other model parameters, but because of the difficulty this would entail, it makes practical sense to assign values to these coefficients *a priori*.
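A small sketch may help show how such an economic-distance matrix behaves. It assumes the unstandardized weight for the pair (*j*, *h*) is the destination's initial output raised to α times distance raised to −γ, followed by row standardization; the three economies, their outputs, and their distances are invented for illustration, and the `origin_alpha` argument is a hypothetical addition used only to show that the origin's own output drops out.

```python
def economic_distance_weights(output, dist, alpha, gamma, origin_alpha=0.0):
    """Row-standardized W built from Q_h^alpha * d_jh^(-gamma) for j != h.

    origin_alpha optionally multiplies row j by Q_j^origin_alpha, to show
    that such a factor is removed by row standardization.
    """
    n = len(output)
    w_star = [[0.0 if j == h else
               output[j] ** origin_alpha * output[h] ** alpha * dist[j][h] ** (-gamma)
               for h in range(n)] for j in range(n)]
    return [[v / sum(row) for v in row] for row in w_star]


Q0 = [100.0, 10.0, 1.0]                  # hypothetical start-of-period outputs
D = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 1.0],
     [2.0, 1.0, 0.0]]                    # hypothetical distances

W_mixed = economic_distance_weights(Q0, D, alpha=1.0, gamma=2.0)
W_distance = economic_distance_weights(Q0, D, alpha=0.0, gamma=2.0)   # pure distance
W_size = economic_distance_weights(Q0, D, alpha=1.0, gamma=0.0)       # pure size
W_origin = economic_distance_weights(Q0, D, 1.0, 2.0, origin_alpha=1.0)

for name, w in (("mixed", W_mixed), ("distance", W_distance), ("size", W_size)):
    print(name, [round(v, 4) for v in w[0]])
```

In this toy case the first economy's weights shift strongly toward the larger of its two partners once output is taken into account, and `W_origin` coincides with `W_mixed`, which is the sense in which the origin's own output is nullified by standardization.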

Data are important in deciding **W**, and since the spatial interaction we are attempting to model using **W**, say as endogenous lag variable **WY**, is typically in economics a spillover or externality, we conventionally look at where and to what extent spillovers are occurring. Typically, in the case of knowledge spillovers, the main sources have been input–output tables, patent concordances, innovation concordances, and proximity analysis. However this is sheer information, and we need to move closer to a theory of network emergence, dynamics, and possible equilibrium conditions to have a more satisfying and coherent basis for **W**.

In order to obtain a closer representation of the spatial interaction process in **W** matrix construction choices, Anselin (2010) suggests greater focus on modeling agents involved in social and economic interaction. Looking back in this context, Patuelli et al. (2007) consider network interaction modeling with reference to earlier work on spatial interaction and discrete choice behavior such as Wilson (1967, 1970) on entropy maximization and McFadden (1974, 1979) on the microeconomic basis of interaction models. Let us first consider **W** as a representation of a network involving nodes (people or places) and links between nodes. These can be seen as dynamic evolving entities, and we can envisage network development to be a response to costs and benefits in being a node or a link on the network. Consequently some networks might be dynamic and ephemeral, some networks in a stable equilibrium, and some networks slowly evolving. Following Goyal (2009), we envisage that ephemeral and dynamic networks occur when there are payoffs. This leads to a theory of network formation, thus “A game of network formation specifies a set of players, the link formation actions available to each player and the payoffs to each player from the networks that arise out of individual linking decisions,” and “A network is said to be strategically stable or an equilibrium if there are no incentives for individual players (either acting alone or in groups) to form or delete links and thereby alter the network” (Goyal, 2009). A quasi-stable network is akin to what is normally envisaged in the regional science literature, where typically a network will be a fixed or very slowly evolving one, as a consequence of major investment in transport infrastructure, which defines the internodal links. Assume that network formation and evolution is a consequence of decisions by network providers (investors in infrastructures) on the one hand, and network users on the other.
As a thought experiment, let us consider how a **W** matrix might emerge and evolve. The network providers create, maintain, or develop the network according to the profit generated, where profit equals revenue minus cost. The revenue comes from the number of network users and the prices they pay to use the network. The cost depends on the extent of the network (e.g., miles of railway to maintain) and may be divided between fixed costs and variable costs, which depend on network usage. The network users choose links on the network according to the level of utility they provide, with the choice of whether or not to use the network, and subsequently which network link to choose, modeled perhaps as a multilevel random utility model. With a poor network, which in a commuting sense might mean slow, unreliable and lengthy journeys, the level of utility will be low and users may prefer not to use such links. Consequently, usage and profits fall, although variable costs may also reduce. In such circumstances it seems that a poorly used network or link might fall into decay, although poor services may induce a reduction in prices, increase utility and usage, change profit levels, stimulate investment, and revive the network. Evidently users and providers are involved in a strategic game, with a potential for equilibrium outcomes, and with dynamic changes to networks and user behavior a possibility. The emerging literature on endogenous network dynamics involving dynamic stochastic games of network formation could provide many insights into how the structure of the **W** matrix can be placed on a more rational basis.

The potential for dynamic **W** matrices poses some problems for estimation, given the assertion that **W** is necessarily a fixed entity. While this may not be such an issue for cross-sectional approaches, where at a given snapshot in time this may be a reasonable approximation, with the extension of spatial econometrics to include panel data modeling it may be the case that **W** is evolving, interacting with the regression variables. Such an endogenous interaction is implied by Anselin (2010), who remarks that “an endogenous spatial weights matrix would jointly determine who interacts (and why) and how that interaction affects the rest of the model. Much progress remains to be made …” But we can have a dynamic **W** matrix as part of a simulation, with no consequence for estimation, as in Fingleton (2001).

### 4. DO WE NEED W?

Let us now imagine that we have network dependence but we choose to ignore it. Assume that we have explanatory variables comprising exogenous **X**'s and an endogenous spatial lag **WY**, so that the data-generating process is

- (21)

and that *X*_{1,i}= 1 for is a standardized *N* by *N* (Rook's case)^{5} contiguity matrix, **u**_{1} and **u**_{2} are *N* by 1 vectors sampling from an *N*(0, 1) distribution, and **I** is an *N* by *N* identity matrix. Vector is *N* by 1 sampling from an *N*(0, σ^{2} **I**) distribution. Also , and are nonsingular, ρ= 0.25, *b*_{1}= 1, *b*_{2}= 8, *b*_{3}= 2, σ^{2}= 1, and *N*= 121.

Given these data, we generate **Y** and estimate two models. One is the correctly specified model **Y** = ρ**WY** + **Xb** + **ɛ** estimated by maximum likelihood. The second is the (mis)specification **Y** = **Xb** + **ɛ** estimated by ordinary least squares (OLS), which incorrectly assumes ρ= 0. Clearly, given spatial dependence in **X**_{2} and **X**_{3}, the OLS **b** estimates will be biased, as is apparent from the 100 replications summarized in Table 1.

Coefficient | b | var(b) | ML: Mean | ML: (Mean − b)/√var(b) | OLS: Mean | OLS: (Mean − b)/√var(b) |
---|---|---|---|---|---|---|
b_{1} | 1 | 0.0087 | 1.0121 | 0.1297 | 1.5522 | 5.9147 |
b_{2} | 8 | 0.0085 | 7.9966 | −0.0365 | 7.9995 | −0.0049 |
b_{3} | 2 | 0.0074 | 1.9948 | −0.0600 | 2.0908 | 1.0536 |
 | | | | | | |
b_{1} | 1 | 0.0087 | 1.0019 | 0.0205 | 1.0729 | 0.7794 |
b_{2} | 8 | 0.0081 | 8.0146 | 0.1630 | 8.8903 | 9.9187 |
b_{3} | 2 | 0.0053 | 1.9948 | −0.0711 | 2.3745 | 5.1567 |
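The direction of the OLS bias can be reproduced with a small Monte Carlo sketch. It is not a replication of Table 1: only OLS is run (no maximum likelihood), the regressors are assumed to follow a spatial autoregressive process with a hypothetical parameter `phi` = 0.7, and 50 rather than 100 replications are used. The values ρ = 0.25, *b* = (1, 8, 2), σ² = 1, and *N* = 121 follow the text.

```python
import random


def rook_neighbours(side):
    """Neighbour lists for a side x side lattice under rook contiguity."""
    nbrs = []
    for r in range(side):
        for c in range(side):
            cells = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            nbrs.append([rr * side + cc for rr, cc in cells
                         if 0 <= rr < side and 0 <= cc < side])
    return nbrs


def spatial_lag(nbrs, v):
    """W v for the row-standardized W implied by the neighbour lists."""
    return [sum(v[h] for h in nb) / len(nb) for nb in nbrs]


def solve_spatial(nbrs, coef, c, iters=60):
    """Approximate (I - coef*W)^{-1} c by iterating y <- c + coef*W y."""
    y = list(c)
    for _ in range(iters):
        y = [ci + coef * v for ci, v in zip(c, spatial_lag(nbrs, y))]
    return y


def ols(columns, y):
    """OLS coefficients via the normal equations and Gauss-Jordan elimination."""
    k = len(columns)
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    return [row[k] for row in a]


def mean_ols_estimates(rho, phi, b=(1.0, 8.0, 2.0), side=11, reps=50, seed=1):
    """Mean OLS estimates of b when Y is generated with a spatial lag rho*WY."""
    rng = random.Random(seed)
    nbrs = rook_neighbours(side)
    n = side * side
    totals = [0.0, 0.0, 0.0]
    for _ in range(reps):
        x1 = [1.0] * n
        x2 = solve_spatial(nbrs, phi, [rng.gauss(0, 1) for _ in range(n)])
        x3 = solve_spatial(nbrs, phi, [rng.gauss(0, 1) for _ in range(n)])
        xb = [b[0] * p + b[1] * q + b[2] * r + rng.gauss(0, 1)
              for p, q, r in zip(x1, x2, x3)]
        y = solve_spatial(nbrs, rho, xb)
        totals = [t + v for t, v in zip(totals, ols([x1, x2, x3], y))]
    return [t / reps for t in totals]


ols_no_dependence = mean_ols_estimates(rho=0.0, phi=0.7)    # OLS is appropriate
ols_ignoring_lag = mean_ols_estimates(rho=0.25, phi=0.7)    # WY wrongly omitted
print(ols_no_dependence, ols_ignoring_lag)
```

When ρ = 0 the OLS means sit close to the true values, whereas omitting a genuine spatial lag pushes the estimates for the spatially dependent regressors upward, in line with the pattern the text describes.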

Given that we often need **W** to obtain unbiased estimates of **b**, we also need it to obtain an unbiased measure of the true effect of a variable, which typically is not the same as **b**, as emphasized by LeSage and Pace (2009). Given a SAR model of the form **Y**=ρ**WY**+**Xb**+**ɛ**, the effect on the dependent variable **Y** of a unit change in an exogenous variable **X**_{j}, that is, the derivative of **Y** with respect to **X**_{j}, is not simply equal to the regression coefficient *b*_{j}. As pointed out by LeSage and Pace (2009), the true derivative also takes account of the spatial interdependencies and simultaneous feedback embodied in the model, leading to a total effect that typically differs from the regression coefficient estimate. It follows that

$$\frac{\partial \mathbf{Y}}{\partial \mathbf{X}_{j}'} = (\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{I}b_{j} \tag{22}$$

in which **I** is the *N* by *N* identity matrix and (**I**−ρ**W**)^{−1}**I***b*_{j} is an asymmetric *N* by *N* matrix, so the derivative varies according to the cells of **X**_{j} and **Y** being considered. We can summarize these differentiated effects by their mean row sum, which is

$$N^{-1}\,\boldsymbol{\iota}'\left[(\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{I}b_{j}\right]\boldsymbol{\iota} \tag{23}$$

in which **ι** is an *N* by 1 vector of 1s. This is the average total effect of a unit change in **X**_{j}. Also we can partition the average total effect of a unit change in all cells of **X**_{j} into a direct and an indirect component. The average direct effect of a unit change in *X*_{rj} on *Y*_{r} is given by the mean of the main diagonal of the matrix, hence

$$N^{-1}\,\mathrm{tr}\left[(\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{I}b_{j}\right] \tag{24}$$

This direct effect is somewhat different from *b*_{j} because it also allows for the fact that a change in *X*_{rj} affects *Y*_{r}, which then affects *Y*_{s} (*s*≠*r*) and so on, cascading through all areas and coming back to produce an additional effect on *Y*_{r}. The difference between the total effect and the direct effect is the average indirect effect of a variable. This is equal to the sum of the off-diagonal cells of the matrix (**I**−ρ**W**)^{−1}**I***b*_{j} divided by *N*, hence

$$N^{-1}\left\{\boldsymbol{\iota}'\left[(\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{I}b_{j}\right]\boldsymbol{\iota} - \mathrm{tr}\left[(\mathbf{I} - \rho\mathbf{W})^{-1}\mathbf{I}b_{j}\right]\right\} \tag{25}$$
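The total, direct, and indirect effects can be computed for a small example. The sketch below assumes a row-standardized rook contiguity **W** on a hypothetical 5 × 5 lattice and uses ρ = 0.25 and *b*_{j} = 8 from the simulation design; the matrix (**I** − ρ**W**)^{−1}*b*_{j} is built column by column with a fixed-point iteration instead of an explicit inverse, and the helper names are illustrative.

```python
def rook_weights(side):
    """Row-standardized rook contiguity matrix for a side x side lattice."""
    n = side * side
    w = [[0.0] * n for _ in range(n)]
    for r in range(side):
        for c in range(side):
            cells = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            nbrs = [(rr, cc) for rr, cc in cells
                    if 0 <= rr < side and 0 <= cc < side]
            for rr, cc in nbrs:
                w[r * side + c][rr * side + cc] = 1.0 / len(nbrs)
    return w


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def multiplier_matrix(w, rho, b_j, iters=100):
    """S = (I - rho*W)^{-1} * b_j, with S[r][h] the effect of X_hj on Y_r."""
    n = len(w)
    cols = []
    for h in range(n):
        c = [b_j if r == h else 0.0 for r in range(n)]   # unit change at node h
        y = list(c)
        for _ in range(iters):
            y = [ci + rho * v for ci, v in zip(c, matvec(w, y))]
        cols.append(y)
    return [[cols[h][r] for h in range(n)] for r in range(n)]


def average_effects(s):
    """Average total, direct, and indirect effects from the matrix S."""
    n = len(s)
    total = sum(sum(row) for row in s) / n
    direct = sum(s[r][r] for r in range(n)) / n
    return total, direct, total - direct


S = multiplier_matrix(rook_weights(5), rho=0.25, b_j=8.0)
total, direct, indirect = average_effects(S)
print(round(total, 4), round(direct, 4), round(indirect, 4))
```

With a row-standardized **W**, every row of (**I** − ρ**W**)^{−1} sums to 1/(1 − ρ), so the average total effect equals *b*_{j}/(1 − ρ), here 8/0.75; the direct effect exceeds *b*_{j} because of the feedback loops described above.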

Table 2 gives the mean total effect of each of **X**_{2} and **X**_{3} for the small simulation with and with .

Parameter (true value) | OLS | ML | Total | Direct | Indirect |
---|---|---|---|---|---|
b_{1}= 1 | 1.5795 | 0.9892 | − | − | − |
b_{2}= 8 | 8.1616 | 8.0119 | 10.7411 | 8.1628 | 2.5784 |
b_{3}= 2 | 2.0379 | 2.0109 | 2.6962 | 2.0488 | 0.6474 |
ρ= 0.25 | − | 0.2535 | − | − | − |
 | | | | | |
b_{1}= 1 | 1.1855 | 0.9921 | − | − | − |
b_{2}= 8 | 9.1596 | 7.9923 | 10.6748 | 8.1391 | 2.5357 |
b_{3}= 2 | 2.3908 | 2.0036 | 2.6759 | 2.0404 | 0.6355 |
ρ= 0.25 | − | 0.2511 | − | − | − |
 | | | | | |
b_{1}= 1 | 2.3287 | 1.0079 | − | − | − |
b_{2}= 8 | 10.1432 | 8.0042 | 10.6645 | 8.1489 | 2.5156 |
b_{3}= 2 | 2.4856 | 1.9998 | 2.6645 | 2.0359 | 0.6286 |
ρ= 0.25 | − | 0.2494 | − | − | − |

Consider next what happens if the true data-generating process is

- (26)

and we (wrongly) fit the SAR model **Y**=ρ**WY**+**Xb**+**ε**. Will it be the case that the presence of **WY** biases the **b** estimates? A common criticism of spatial econometrics is that the significance of the spatial lag is falsely interpreted as a true spatial spillover effect. Indeed, too many spatial econometricians have been overenthusiastic in their adoption of the spatial lag without giving sufficient consideration to the theoretical rationale for the model specification. The consequences depend on the context. If the spatial lag **WY** is simply an additional, unnecessary term in the model, then typically ρ≈ 0, which is of no significance. If, however, the SAR specification excludes **X**_{3}, then the outcome depends on whether **X**_{3} is spatially dependent. Most importantly, when it is, the inference that ρ≠ 0 may be simply due to the spatial dependence of the omitted variable. On the other hand, when **X**_{3} is spatially random and uncorrelated with the included variables, its absence does not affect the estimate obtained for ρ, in which case we should expect ρ≈ 0. Simply as an illustration, we generate data via **Y**=**Xb**+**ε** with *b*_{1}= 1, *b*_{2}= 8, *b*_{3}= 2, and σ^{2}= 1, and our two estimating equations give the following mean estimates.

Incorrectly omitting **X**_{3} when it is spatially dependent has a significant biasing effect on the maximum likelihood (ML) estimates. In particular, the spatial lag is evidently picking up the effect of the spatially dependent omitted variable. With regard to the *t* ratios for ρ, the mean of 100 replications is 7.85. Likewise, the outcome is a biased estimate of *b*_{2}. Importantly, observe also that the total, direct, and indirect effects of a variable will be incorrect when the specification wrongly includes the endogenous spatial lag. For example, in Table 3, the true effect of **X**_{2} is given by *b*_{2}= 8.

| | OLS | ML | Total | Direct | Indirect |
|---|---|---|---|---|---|
| b_{1}= 1 | 1.0041 | 1.1144 | − | − | − |
| b_{2}= 8 | 8.0107 | 7.9017 | 8.0802 | 7.9035 | 0.1767 |
| b_{3}= 2 | 2.0012 | − | − | − | − |
| ρ= 0 | − | 0.0219 | − | − | − |
| | | | | | |
| b_{1}= 1 | 0.9910 | 0.6611 | − | − | − |
| b_{2}= 8 | 8.0193 | 6.1322 | 12.6192 | 6.6749 | 5.9443 |
| b_{3}= 2 | 1.9968 | − | − | − | − |
| ρ= 0 | − | 0.5137 | − | − | − |

However, it turns out that we just might be able to use **W** to mitigate the bias arising from the omission of spatially dependent regressors, for it has been shown by LeSage and Pace (2008) and Pace and LeSage (2008) that fitting the so-called spatial Durbin model,

- (27)

eliminates coefficient estimate bias, but this solution rests on the assumption that **W** is the correct one for the omitted variable SAR process. This is a topic that is explored and the analysis extended in Fingleton and Le Gallo (2009), who find via Monte Carlo simulation that when the omitted variable does not equate to a spatial autoregressive process in **W**, an augmented spatial Durbin specification, augmented to also include an autoregressive error dependence process, produces biased estimates, but ones that are less biased than those obtained by ignoring the existence of omitted variables.
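One way to see why a spatial Durbin form can absorb an omitted, spatially dependent regressor is the following sketch. The symbols **z** (the omitted variable), **r** (its innovation), and **θ** (the coefficient vector on **WX**) are introduced here purely for illustration and are not the authors' notation. The sketch assumes that **z** follows a SAR process in the same **W** and ρ, and that **r** is uncorrelated with **X**.

$$
\begin{aligned}
\mathbf{Y} &= \mathbf{Xb} + \mathbf{z}, \qquad \mathbf{z} = \rho\mathbf{Wz} + \mathbf{r} \;\Rightarrow\; \mathbf{z} = (\mathbf{I}-\rho\mathbf{W})^{-1}\mathbf{r},\\
(\mathbf{I}-\rho\mathbf{W})\mathbf{Y} &= (\mathbf{I}-\rho\mathbf{W})\mathbf{Xb} + \mathbf{r},\\
\mathbf{Y} &= \rho\mathbf{WY} + \mathbf{Xb} + \mathbf{WX}\boldsymbol{\theta} + \mathbf{r}, \qquad \boldsymbol{\theta} = -\rho\mathbf{b}.
\end{aligned}
$$

Under these assumptions the omitted variable is fully absorbed by the terms in **WY** and **WX**, which makes clear why the argument depends on **W** being the correct matrix for the omitted process. If **r** is instead correlated with **X**, the coefficient on **WX** is no longer tied to −ρ**b**, which is one reason for estimating it freely.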

While we often need **W**, sometimes its presence is unnecessary and can be misleading. A second note of caution also suggesting moderation of the emphasis by LeSage and Pace (2009) on **W** leading to total, direct and indirect effects, as the proper interpretation of the impact of exogenous variables in the presence of a spatial lag, comes from the specification

- (28)

Following equation (22), , but equation (22) does not apply with regard to **X**_{1}, and instead . This type of specification has been suggested by Fingleton (2003), Fingleton (2006), and Barde (2010) as the reduced form resulting from the existence of an ancillary SAR process.

In the following example, log labor efficiency (ln **A**) is assumed to depend on local exogenous variables embodied in the *N* by *k* matrix **X**, on log labor efficiency in ‘nearby’ areas (**W**ln**A**), and on random disturbances (**ξ**), hence ln **A** = **Xb** + ρ**W**ln**A** + **ξ**, **ξ** ∼ *N*(0, σ^{2}). It is convenient to specify this with the exogenous variables on the right-hand side, hence ln **A** = (**I**−ρ**W**)^{−1}(**Xb** + **ξ**). Starting from an explicit economic theory with microfoundations, they assume wages **w** depend on employment density **E** and labor efficiency **A** in each area *j*, *j* = 1, …, *N*, thus ln *w*_{j} = *k*_{1} + (γ−1) ln *E*_{j} + (γ−1) ln *A*_{j}. Substituting and rearranging obtains

- (29)

It is apparent that the partial derivative is simply equal to (γ− 1), so despite the existence of the spatial lag **W**ln**w** in our model, there is no need to interpret the effect of this variable any differently from the normal interpretation.
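The substitution can be written out as follows. This is a sketch of one way to carry out the algebra from the two expressions for ln **A** and ln *w*_{j} given in the text; its arrangement of terms may differ from that of equation (29), and the symbol **ι** for an *N* by 1 vector of 1s is an assumption of the sketch.

$$
\begin{aligned}
\ln\mathbf{w} &= k_{1}\boldsymbol{\iota} + (\gamma-1)\ln\mathbf{E} + (\gamma-1)(\mathbf{I}-\rho\mathbf{W})^{-1}(\mathbf{Xb}+\boldsymbol{\xi}),\\
(\mathbf{I}-\rho\mathbf{W})\ln\mathbf{w} &= k_{1}(\mathbf{I}-\rho\mathbf{W})\boldsymbol{\iota} + (\gamma-1)(\mathbf{I}-\rho\mathbf{W})\ln\mathbf{E} + (\gamma-1)(\mathbf{Xb}+\boldsymbol{\xi}),\\
\ln\mathbf{w} &= \rho\mathbf{W}\ln\mathbf{w} + k_{1}(\mathbf{I}-\rho\mathbf{W})\boldsymbol{\iota} + (\gamma-1)\ln\mathbf{E} - \rho(\gamma-1)\mathbf{W}\ln\mathbf{E} + (\gamma-1)\mathbf{Xb} + (\gamma-1)\boldsymbol{\xi}.
\end{aligned}
$$

The first line shows directly that the derivative of ln **w** with respect to ln **E** is (γ−1)**I**: the spatial multiplier (**I**−ρ**W**)^{−1} acts only on **X** and **ξ**. The last line contains both **W**ln**w** and **W**ln**E**, but their contributions to the effect of ln **E** cancel exactly.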

This then leads us to the problem of inference and interpretation in a spatial econometric model that is driven by underlying economic theory, as is the case in Fingleton (2003, 2006), and Barde (2010), compared with the inference and interpretation one would associate with a model specification that is governed entirely by empirical analysis. It is apparent that we could obtain misleading results if empirical analysis suggests a model specification that is contrary to the true specification.

Consider the following simple example, in which the data generating process (DGP) is the above model, but instead the following spatial Durbin specification is fitted to the data

- (30)

From this starting point, our best-fitting model would probably be a constrained version of this specification, but without knowledge of the underlying theory driving the DGP we would never consider the true specification among the set of optional models, and come to a false interpretation of the effects of the variables.

To illustrate this, consider the DGP based on an 11 by 11 lattice giving *N*= 121 observations of variables **w**, **E**, and **X**, with the 121 by 121 **W** matrix comprising a matrix of 1s and 0s according to the Rook's contiguity criterion, subsequently standardized to row totals of 1. The values of **E** and **X** are, respectively, *N* by 1 and *N* by 2 matrices of pseudorandom numbers drawn from the standard uniform distribution on the open interval (0, 1) excluding 0s. Given *k*_{1}= 1, γ= 1.25, ρ= 0.15, *c*_{1}= 8, *c*_{2}= 2, and (γ− 1)^{2}σ^{2}= (γ− 1)^{2}, we generate ln **w** via

- (31)

Typical outcomes of this DGP and model-fitting exercise are given in Table 4. We also eliminate the insignificant spatial lags (Model B).
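A data-generating process of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not a reproduction of equation (31): it assumes the reduced form ln **w** = *k*_{1} + (γ−1) ln **E** + (**I**−ρ**W**)^{−1}(**Xc** + (γ−1)**ξ**), takes *c*_{1} and *c*_{2} as the coefficients on **X** after scaling by (γ−1), and uses hypothetical helper names `rook_weights` and `simulate_ln_w`. It then perturbs ln *E* in one cell to check the effect numerically.

```python
import numpy as np

def rook_weights(rows, cols):
    """Row-standardized rook-contiguity matrix for a rows x cols lattice (illustrative helper)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def simulate_ln_w(ln_E, X, xi, W, k1=1.0, gamma=1.25, rho=0.15, c=(8.0, 2.0)):
    """Assumed reduced form: k1 + (gamma-1) ln E + (I - rho W)^{-1} (X c + (gamma-1) xi)."""
    n = W.shape[0]
    rhs = X @ np.asarray(c) + (gamma - 1.0) * xi
    spatial_part = np.linalg.solve(np.eye(n) - rho * W, rhs)
    return k1 + (gamma - 1.0) * ln_E + spatial_part

rng = np.random.default_rng(0)
W = rook_weights(11, 11)
N = W.shape[0]                           # 121 observations
ln_E = np.log(1.0 - rng.uniform(size=N)) # E drawn from (0, 1], so the log is finite
X = 1.0 - rng.uniform(size=(N, 2))
xi = rng.standard_normal(N)              # sigma^2 = 1

base = simulate_ln_w(ln_E, X, xi, W)
bumped = ln_E.copy()
bumped[60] += 1.0                        # unit change in ln E for one cell
effect = simulate_ln_w(bumped, X, xi, W) - base
print(effect[60], np.abs(np.delete(effect, 60)).max())
```

In this sketch a unit change in ln *E* in one area moves ln *w* in that area by γ−1 and leaves all other areas unchanged, which is the sense in which no spatial multiplier applies to ln **E**.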

| | Model A: Coefficient | Model A: Asymptotic *t* | Model A: *p* (*z*) | Model B: Coefficient | Model B: Asymptotic *t* | Model B: *p* (*z*) |
|---|---|---|---|---|---|---|
| b_{0} | 1.689 | 5.670 | 0.000 | 1.817 | 11.163 | 0.000 |
| b_{1}(lnE) | 0.245 | 7.614 | 0.000 | 0.240 | 7.665 | 0.000 |
| b_{2}(WlnE) | −0.022 | −0.326 | 0.743 | − | − | − |
| c_{1}(X_{1}) | 7.993 | 633.3 | 0.000 | 7.999 | 846.2 | 0.000 |
| c_{2}(X_{2}) | 2.005 | 221.2 | 0.000 | 2.006 | 236.1 | 0.000 |
| d_{1}(WX_{1}) | −0.153 | −0.709 | 0.477 | − | − | − |
| d_{2}(WX_{2}) | −0.036 | −0.638 | 0.522 | − | − | − |
| ρ (Wlnw) | 0.167 | 6.334 | 0.000 | 0.149 | 67.81 | 0.000 |
| R^{2} | 0.999 | | | 0.999 | | |
| | 0.999 | | | 0.999 | | |
| σ^{2} | 0.076 | | | 0.077 | | |
| N | 121 | | | 121 | | |
| ll | 25.14 | | | 24.88 | | |

These model estimates closely approximate those assumed in the DGP, but they give a misleading indication of the true effect of the variables. Using the equations given above, we would infer that the average total effect of ln**E** is *N*^{−1}**ι**′(**I**−ρ**W**)^{−1}**I***b*_{1}**ι**= 0.3387, with **ι** an *N* by 1 vector of 1s, compared to the true value of 0.25.

### 5. CHOOSING W EMPIRICALLY

Consider next the question of which matrix **W** should be chosen, given the obvious scope for numerous competing weights matrices. While theory will in the best-practice cases drive the structure of **W**, there nevertheless remain a number of degrees of freedom in the exact **W** specification. For example, does one adopt a negative exponential decay or a power function for the distance term in equation (20)? Harris, Moffat, and Kravtsova (2010) review some alternative approaches to constructing **W**, such as trawling through data, perhaps designing **W** around the residuals from a first stage regression, but this is atheoretical.

Burridge and Fingleton (2010) set out the history of the problem, commencing with Anselin (1986) who considers the simple model,

- (32)

where the *N*×*N* weight matrix, **W**, has three different forms, **W**_{A}, **W**_{B}, and **W**_{C}. Taking **W**_{A} to be the null hypothesis, Anselin considers *J*-type statistics in order to discriminate between these alternatives, obtained by augmenting (32) by additional explanatory variables equal to the fitted values from the model with weights **W**_{B} or from the model with weights **W**_{C}.

To further illustrate this, we follow Kelejian (2008) and Burridge and Fingleton (2010) and consider the more elaborate SARAR model, in which the choice of the **W** matrix is accompanied by the question of which matrix **M** to adopt, the latter defining the autoregressive spatial error dependence in the SARAR(1,1) model, thus

- (33)

Here, the *N*×*k*_{0} matrix of exogenous variables, **X**_{0}, and the *N*× 1 vector for the dependent variable, **Y,** are each measured without error, the two *N*×*N* weight matrices, **W**_{0} and **M**_{0} are fixed *a priori*, and the unobserved shock vector, **v**_{0}∼*iid*(0, σ^{2}_{0} **I**_{N}) is independent of the exogenous regressors, **X**_{0}. The parameters to be estimated are the slope coefficients, **b**_{0}, the spatial lag and error coefficients, ρ_{0} and λ_{0}, and the variance, σ^{2}_{0}. The suffix 0 denotes that this specification is one of (at least) two competing non-nested hypotheses. Under the alternative, the data are generated by a similar structure, hence

- (34)

Kelejian (2008) considers the tests of these competing models, extending the problem by allowing >2 non-nested alternatives. Among the hypotheses that can be tested, Burridge and Fingleton (2010) assume the explanatory variables, **X**_{0} and **X**_{1} are the same in the two models, but the spatial structures differ, so that **W**_{1}≠**W**_{0} and **M**_{1}≠**M**_{0}, but for simplicity they set **M**_{0}=**W**_{0} and **M**_{1}=**W**_{1}≠**W**_{0}.
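For orientation, a SARAR(1,1) pair of non-nested hypotheses consistent with the definitions in the text can be sketched as below. The symbols **u**_{0} and **u**_{1} for the spatially autocorrelated disturbances are assumptions of this sketch, since the text names only the innovations **v**; all other symbols are those defined above, with suffix 1 denoting the alternative.

$$
\begin{aligned}
H_{0}:\quad \mathbf{Y} &= \rho_{0}\mathbf{W}_{0}\mathbf{Y} + \mathbf{X}_{0}\mathbf{b}_{0} + \mathbf{u}_{0}, &\quad \mathbf{u}_{0} &= \lambda_{0}\mathbf{M}_{0}\mathbf{u}_{0} + \mathbf{v}_{0},\\
H_{1}:\quad \mathbf{Y} &= \rho_{1}\mathbf{W}_{1}\mathbf{Y} + \mathbf{X}_{1}\mathbf{b}_{1} + \mathbf{u}_{1}, &\quad \mathbf{u}_{1} &= \lambda_{1}\mathbf{M}_{1}\mathbf{u}_{1} + \mathbf{v}_{1}.
\end{aligned}
$$

Written this way, the simplification adopted by Burridge and Fingleton (2010), with common regressors and **M**_{0}=**W**_{0}, **M**_{1}=**W**_{1}, leaves the two hypotheses differing only in the weights matrix that drives both the lag and the error process.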

One alternative to the *J*-type statistics is to use an information criterion, thus avoiding several model comparisons, as suggested by Leenders (2002). However “… unfortunately different information criteria will in general lead to the selection of different models, so that the uniqueness of the chosen model relies on the investigator first selecting which criterion to adopt” (Burridge and Fingleton, 2010).

Related approaches use Bayesian model averaging (LeSage and Parent, 2007; LeSage and Fischer, 2008). While parameter uncertainty is well known, model uncertainty involving the unknown true structure of the **W** matrix is less well explored. However, finding the true **W** may involve searching through a very large number of competing specifications, which may include the true specification, rather than being decided on theoretical grounds.

### 6. ALTERNATIVES TO W

It is evident that the specification of **W** is fundamental and that, for some, the problems this raises, whether from the perspective of logic, theory, or empirics, pose a significant hurdle (McMillen, 2010a). Some have advocated rejecting the traditional *a priori* fixed **W** matrix approach in favor of potentially less problematic ways of introducing spatial interaction in spatial econometric models (Folmer and Oud, 2008; Harris et al., 2010). Thus instead of constructing **W**, they suggest directly entering variables in the regression model that proxy spillovers (Harris et al., 2010). This is the approach adopted by Paci and Usai (2009). As Harris et al. (2010) observe, “the essential difference between this and the standard approach using **W** is that spillovers are not entered through the interaction between regions of the dependent or other (state) variables in the model, weighted by **W**, but rather through constructing ‘stand alone’ proxies for spatial spillovers.” However, we argue that such an approach itself requires strong identifying assumptions and therefore possesses no real advantage compared to employing a **W** matrix. In other words, this type of approach also typically involves some form of *a priori* variable selection and weighting, usually in the form of a geographical proximity measure, and so does not really represent a complete departure from the **W** matrix approach. Even if weighting can be avoided, it introduces additional complexities of (arbitrary) variable definition.

With time series and panel data, we have more scope for a more refined approach. Seemingly unrelated regression (SUR) probably provides the most complete break from the **W** matrix approach, because it estimates the error variance–covariance matrix on the basis of location-specific time series, thus, according to Anselin (1988), “the spatial dependence is not expressed in terms of a particular parameterized function, but left unspecified as a general covariance,” and “in spatial econometrics, this model has been suggested as an alternative to the use of spatial weights.” Other time series related alternatives in the form of vector autoregressions, which attempt to pick up spatial interaction via the presence of (any number of) lagged variables from “neighboring” regions are an interesting alternative, but with a large number of regions ultimately collapse under the weight of a large number of parameters to estimate and interpret. However, these problems may not always be fatal. One way out of this problem is via the Bayesian approach of LeSage and Krivelyova (1999), and Chang and Coulson (2001) successfully use a structural vector autoregression to model spatial spillovers. Another approach that has been advocated (Kelejian and Prucha, 2007) is spatial non-parametric heteroscedasticity and autocorrelation consistent estimation (SHAC). This gives consistent estimates of the error covariance matrix under rather general assumptions, allowing various patterns of correlation and heteroscedasticity, including a spatial ARMA(*p,q*) error process (hence with *p* autoregressive parameters and *q* moving average parameters). Kelejian and Prucha (2007) assume that the disturbance vector **e** is

where **R** is a nonstochastic matrix with unknown elements. The asymptotic distribution of IV estimators implies that the variance–covariance matrix is

in which ={σ_{ij}} is the variance–covariance matrix of **e** and is a full column rank matrix of instruments. The (*r*, *s*)th element is

where is the IV residual for observation *i*, *d*_{ij} is the distance between locations *i* and *j*, *d*_{n} is the bandwidth and *K*(·) is a kernel function. Among the alternatives, we might opt for the Parzen kernel as given by Andrews (1991),

From this, it is evident that this approach is not assumption free. Alternative kernels, such as the Bartlett, Tukey-Hanning, and Quadratic spectral kernels, each put different weights on the lagged covariances. Additionally, different bandwidth or lag truncation parameter options exist. This is also the case for the semiparametric approach proposed by McMillen (2010b) in this volume, which uses a Cubic kernel transformation of the covariate, **X**, as a function of geographical distance, **f**(**X**(**d**)), in a relationship that suffers from missing variables and incorrect functional form. Smoothing over space adds a variable that is correlated with the omitted variable, which is also correlated with space, and so adds significant explanatory power to the model. In addition, the use of the nonlinear spline function resolves the functional form misspecification. In practice these choices are essentially data driven, and the choices made drive the performance of both (S)HAC and semiparametric estimators. Evidently SHAC and semiparametric estimators provide no obvious advantage over the flexibility inherent in a data driven, potentially asymmetric, **W** matrix approach, which of course may be applied (perhaps using different **W**'s) not only to the error process, but also to endogenous and exogenous variables. Moreover, rather than neutralizing spatial dependence as a nuisance phenomenon, where spatial econometrics takes a lead is in its ability to identify and test theory relating to explicit spatial dependence mechanisms, as embodied in the parameterization of a **W** matrix. Of course this demands a specific functional form, and values assigned to parameters, but careful data analysis may help to identify the most appropriate specification, in ways that parallel how optimal HAC estimators are obtained.
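The dependence of SHAC on kernel and bandwidth choices is easy to see in code. The sketch below implements the Parzen kernel in the form given by Andrews (1991) and a kernel-weighted estimate of the middle matrix with (*r*, *s*)th element *n*^{−1}∑_{i}∑_{j} *z*_{ir} *z*_{js} *û*_{i} *û*_{j} *K*(*d*_{ij}/*d*_{n}). The names `parzen_kernel`, `shac_middle_matrix`, `Z`, and `resid`, and the toy data, are assumptions for illustration, not the notation or code of Kelejian and Prucha (2007).

```python
import numpy as np

def parzen_kernel(x):
    """Parzen kernel: 1 - 6x^2 + 6|x|^3 on [0, 1/2], 2(1 - |x|)^3 on (1/2, 1], 0 beyond."""
    a = np.abs(np.asarray(x, dtype=float))
    inner = 1.0 - 6.0 * a**2 + 6.0 * a**3
    outer = 2.0 * (1.0 - a) ** 3
    return np.where(a <= 0.5, inner, np.where(a <= 1.0, outer, 0.0))

def shac_middle_matrix(Z, resid, dist, bandwidth):
    """Kernel-weighted estimate of n^{-1} Z' Sigma Z from residuals and pairwise distances."""
    n = Z.shape[0]
    K = parzen_kernel(dist / bandwidth)  # n x n matrix of kernel weights
    Zu = Z * resid[:, None]              # row i is z_i scaled by its residual
    return Zu.T @ K @ Zu / n

# Toy data: six locations on a line, an intercept and one instrument (hypothetical values).
coords = np.arange(6.0)
dist = np.abs(coords[:, None] - coords[None, :])
Z = np.column_stack([np.ones(6), coords])
resid = np.array([0.5, -1.0, 0.3, 0.8, -0.2, -0.4])

psi_narrow = shac_middle_matrix(Z, resid, dist, bandwidth=0.5)  # no neighbors within the bandwidth
psi_wide = shac_middle_matrix(Z, resid, dist, bandwidth=3.0)    # nearby residuals now contribute
print(psi_narrow)
print(psi_wide)
```

With a bandwidth smaller than the closest inter-point distance only the diagonal terms survive, so the estimate collapses to a heteroscedasticity-only form. Widening the bandwidth brings in cross-products of neighboring residuals, which illustrates how the result depends on the bandwidth chosen.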

It is apparent that while alternatives have been advocated, which attempt to model spatial dependence, ultimately their application also calls for some simplifying and operational assumptions, so that they do not, except for SUR modeling, represent a complete break from what is also required by the traditional **W** matrix approach. However, there is one outstanding approach to modeling spatial effects, which has had very little attention in the spatial econometrics literature, namely HLM. At first glance, this seems also to be a way to allow spatial effects without having to resort to a **W** matrix as a convenient and practicable option. However, as we demonstrate for the first time, the **W** matrix is also embodied within this approach.

### 7. HIERARCHICAL MODELS

Hierarchical models are becoming increasingly popular across the range of the social sciences, as researchers come to appreciate that observed outcomes depend on variables organized in a nested hierarchy. We see many applications of multilevel modeling in educational research where there exist a number of well-defined groups organized within a hierarchical structure. In economic geography, with a hierarchy of local, regional, and national effects typically influencing outcomes, the obvious starting point is multilevel modeling, in which individual level cross-sectional (spatial) data within the same local administrative area, for example, are subject to an effect because of their common location. Perhaps local property taxes are different across local administrative units, and properties, which are the units of observation, have prices partly reflecting these local tax differences. Additional spatial effects may arise at different levels of a nested hierarchy; for instance we may wish also to control for the effects of being located within the same region, perhaps because policy instruments having an effect on property prices are applied at the regional level and are different from the effects of local tax differentiation.

Recognition of the different forms of interactions between variables that affect each individual unit of the system and the groups they belong to has important empirical implications. In fact, regardless of spatial autocorrelation, the assumption of independence is usually incorrect when data are drawn from a population with a grouped structure, since this adds a common element to otherwise independent errors, thereby inducing correlated within-group errors. Moulton (1986) finds that it is usually necessary to account for the grouping either in the error term or in the specification of the regressors. Apart from within-group errors, it is also possible that errors between groups will be correlated. For example, if the groups are geographical regions then regions that are neighbors might display greater similarity than regions that are distant. Again, Moulton (1990) shows that even with a small level of correlation, the use of OLS will lead to standard errors with downward bias and to erroneous conclusions of statistical significance.

One way of incorporating the group effect in a multilevel framework is to evaluate the impact of higher level variables that measure one or more aspects of the composition of the group to which an individual belongs. Bryk and Raudenbush (1992) consider different ways of doing this, such as using a simple mean covariate over the higher level units as an explanatory variable. The mean covariate characterizes group effects that are measurable and in this respect differs advantageously from the use of dummy variables that capture the net effect of several omitted variables. Note that it is possible that having controlled for these measurable compositional effects there are still unobservable spatial effects.

Such correlated unobservables can be modeled either as fixed or random effects. If we have data grouped by geographic area with all the areas represented in the sample then a fixed effects specification is appropriate. When only some of the areas are represented in the sample or there is a pattern of dependence involving unknown spatial effects, we might opt for random effects in a hierarchical model operating through the error term. This is achieved by way of an unrestricted nondiagonal covariance matrix. As with unconditional analysis of variance (ANOVA), this will provide the decomposition of the variance for the random effect into an individual component and a group component. Under a spatial dependence process acting at the level of random group effects, the random components are typically affected by those of neighboring groups. This assumption is usually a relaxation of the main hypothesis in HLM, that is, independence between groups. As we have seen, especially when the groups are geographical areas, this might often be unrealistic.

Figure 4 is a diagrammatic representation of a hierarchical structure, with *r* denoting the top level, *g* the second level, and *I* the individual level. There is a varying number of individuals per second-level group, and varying numbers of second-level groups in each category at the top level of the hierarchy. In the context of spatial data, we might consider a geographical grouping of individuals with the highest level being regions (*r*), each of which nests smaller geographical subregions (*g*). These subregions may be either specific areas of residence or some other relevant geographical units. Located within each subregion, there are individuals (*I*) with a varying number of individuals per subregion. We associate with each individual a response *Y*_{i} that is dependent upon a set of covariates *X*_{i}. However, in assessing whether we might assign any causal relationships between one or more covariates in *X*_{i} and the individual response *Y*_{i}, it is necessary to consider the hierarchical structure of the data, and in particular within- and between-group effects.

There are a number of advantages in taking a multilevel approach. First, in standard unilevel OLS estimation the presence of nested groups of observations may be dealt with by the use of dummy variables. However, a large number of levels results in a dramatic reduction in degrees of freedom. Second, a multilevel approach helps us to analyze the effect of heterogeneous groups in the small sample situation. In fact with unbalanced data, while OLS estimates of the coefficients give equal weights to each cluster, the preferred model acknowledges the fact that estimates for the fixed coefficients can change according to the cluster size. It is therefore possible to adjust both the estimates and the inference according to the precision associated with each group, which is determined by the number of individuals in each group (this is technically referred to as shrinkage). In most applications, shrinkage is desirable so that clusters that provide little information have little influence in estimation. The hierarchical multilevel method, which is sample size dependent, seems to have a distinct edge over other methods in eliminating bias.
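The role of group size in shrinkage can be illustrated with the textbook random-intercept case. In the sketch below, the weight placed on a group's own mean is σ^{2}_{u}/(σ^{2}_{u}+σ^{2}_{e}/*n*_{j}), a standard result for a two-level model with known variance components. The function names and the numbers are hypothetical, and the variance components are treated as given rather than estimated.

```python
import numpy as np

def shrinkage_weights(group_sizes, sigma2_u, sigma2_e):
    """Weight on each group's own mean; approaches 1 as the group size n_j grows."""
    n = np.asarray(group_sizes, dtype=float)
    return sigma2_u / (sigma2_u + sigma2_e / n)

def shrunken_group_means(group_means, group_sizes, grand_mean, sigma2_u, sigma2_e):
    """Precision-weighted compromise between each group mean and the grand mean."""
    lam = shrinkage_weights(group_sizes, sigma2_u, sigma2_e)
    return lam * np.asarray(group_means, dtype=float) + (1.0 - lam) * grand_mean

sizes = [1, 9]  # one small and one large group (hypothetical)
weights = shrinkage_weights(sizes, sigma2_u=1.0, sigma2_e=1.0)
means = shrunken_group_means([2.0, 2.0], sizes, grand_mean=0.0, sigma2_u=1.0, sigma2_e=1.0)
print(weights, means)
```

Two groups with the same raw mean are pulled toward the grand mean by different amounts: the group with a single observation is shrunk much more strongly than the group with nine.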

Compared to other approaches such as clustered-standard-error (CSE) OLS, HLM has some advantages: first, while CSE techniques treat the random variation as a simple nuisance, the objective of HLM is to estimate and decompose the total random variation in an individual component and a group component. Second, while CSE only adjusts standard errors for nonindependence, HLM provides us with estimates of the variance components at each level and these affect point estimates also. In turn, variances and covariances constitute valuable information on the contribution of nonobservable factors at each level to the variation of the dependent variable (Aslam and Corrado, 2007).

### 8. MODELING SPATIAL EFFECTS THROUGH THE ERROR TERM

In the random coefficient model, the level of the individual response varies according to location. For example, individuals' income levels, controlling for individual level covariates (**X**) such as educational attainment, might vary if they reside in different areas. This could be partly because of, say, fixed level 2 contextual factors (**Z**), and partly because of level 2 specific random effects {*u*_{j}}. However, these fixed and random effects, while jointly accounting for heterogeneity across residential areas, are not spatially correlated, a topic we address subsequently. With this in mind, the specification of a multilevel model is

- (35)

where **Z**={*Z*_{ij}} is a set of contextual factors at level 2, **Y**={*Y*_{ij}}, **X**={*X*_{ij}} and **ɛ**={*e*_{ij}}+{*u*_{j}}. The dimensions of **Y**, **X**, and **Z** are (*N*× 1), (*N*×*k*), and (*N*×*q*), respectively. The vectors _{0}, _{1}, and denote the vectors of fixed effect coefficients. The additive error term is composed of an idiosyncratic random error term *e*_{ij} for the *i*th unit belonging to level *j* and a random effect *u*_{j} accounting for some level 2 heterogeneity. We make the following assumptions:

_{j}- (36)

If we let σ^{2}_{e} (σ^{2}_{u}) denote the variance of *e*_{ij} (*u*_{j}) such that for *cov*(*e*_{ij}, *u*_{j}) = 0, then σ^{2}_{ɛ}=σ^{2}_{e}+σ^{2}_{u} represents the sum, respectively, of the within- and between-group variances. Based upon the above, the (equicorrelated) intraclass correlation is

$$\frac{\sigma^{2}_{u}}{\sigma^{2}_{e}+\sigma^{2}_{u}} \qquad (37)$$

This correlation measures the proportion of the variance explained at the group level. In single-level models, σ^{2}_{u}= 0 and σ^{2}_{ɛ}=σ^{2}_{e} becomes the standard single-level residual variance.

In order to accommodate spatial effects operating via the error term, we can rewrite the composite error term as

- (38)

where **W**={*W*_{jh}} allows us to specify the way neighboring areas affect *u*_{j}. The matrix **W** is a matrix of distances between the *G* entities as discussed below. The intraclass correlation is now given by

- (39)

Trivially, if **W**=**I**, where **I** is the identity matrix, then we have the standard random effects model ignoring any between-group effect.
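The covariance structure implied by a composite error of this kind can be built explicitly. The sketch below assumes, for illustration only, that the composite error for individual *i* in group *j* is *e*_{ij} plus the *j*th element of **Wu**, with **W** a *G* by *G* matrix. This is a plausible reading of the text rather than a reproduction of equation (38), and the function names are hypothetical.

```python
import numpy as np

def membership_matrix(group_sizes):
    """N x G indicator matrix: row i has a 1 in the column of individual i's group."""
    G = len(group_sizes)
    labels = np.repeat(np.arange(G), group_sizes)
    return np.eye(G)[labels]

def composite_error_cov(group_sizes, sigma2_e, sigma2_u, W=None):
    """Covariance of e + D W u, where D maps the G group effects to the N individuals."""
    D = membership_matrix(group_sizes)
    N, G = D.shape
    W = np.eye(G) if W is None else np.asarray(W, dtype=float)
    return sigma2_e * np.eye(N) + sigma2_u * (D @ W @ W.T @ D.T)

sizes = [2, 3]
standard = composite_error_cov(sizes, sigma2_e=3.0, sigma2_u=1.0)  # W = I
icc = standard[0, 1] / standard[0, 0]                              # within-group correlation

W = np.array([[0.5, 0.5],
              [0.5, 0.5]])                                         # hypothetical weights
spatial = composite_error_cov(sizes, sigma2_e=3.0, sigma2_u=1.0, W=W)
print(icc, standard[0, 2], spatial[0, 2])
```

With **W**=**I** the covariance matrix is block diagonal and the within-group correlation equals σ^{2}_{u}/(σ^{2}_{e}+σ^{2}_{u}). With a non-identity **W**, individuals in different groups become correlated, which is the between-group effect the standard random effects model ignores.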

If *u*_{j} are treated as fixed effects then we need to assume that *cov*(*e*_{ij}, *u*_{j}) =σ_{eu}= 0, that is, transient individual-level random effects are uncorrelated with, say, a level 2 variable such as the area of residence. If *u*_{j} and *e*_{ij} are not independent, the generalized least squares (GLS) estimator would be biased and inconsistent. If *u*_{j} are random effects, we also assume independence between these and the covariates such that *cov*(*X*_{ij}, *u*_{j}) =σ_{ux}= 0 (Blundell and Windmeijer, 1997).

We first rewrite (35) in compact form as

- (40)

where **J** is (*N*× (*k*+*q*+ 1)) and is ((*k*+*q*+ 1) × 1) and is the design matrix of the random parameters that is used in the estimation to derive the estimates for and The hierarchical two-stage method for estimating the fixed and random parameters (the variance and covariances of the random coefficients) originally proposed by Goldstein (1986), is based upon an iterative generalized least squares (IGLS) method that results in consistent and asymptotically efficient estimates of .^{6}

As Goldstein (1989) has stressed, the IGLS used in the context of random multilevel modeling is equivalent to a maximum-likelihood method under multivariate normality, which in turn may lead to biased estimates. To produce unbiased estimates, a restricted iterative generalized least squares (RIGLS) method may be used which, after convergence is achieved, turns out to be equivalent to a restricted maximum likelihood (REML) estimate. One advantage of the latter method is that, in contrast to IGLS, estimates of the variance components via RIGLS/REML take into account the loss of degrees of freedom resulting from the estimation of the regression parameters. Hence, while the IGLS estimates for the variance components have a downward bias, the RIGLS/REML estimates do not.
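The degrees-of-freedom point has a simple single-level analogue. In an ordinary linear regression with *n* observations and *k* regressors, the maximum likelihood estimate of the error variance divides the residual sum of squares by *n*, whereas the restricted (REML) estimate divides by *n*−*k*. The sketch below illustrates only this analogue, not IGLS or RIGLS themselves, and the function name and data are hypothetical.

```python
import numpy as np

def variance_estimates(y, X):
    """Return (ML, REML) estimates of the error variance in a linear regression."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return rss / n, rss / (n - k)

x = np.arange(10.0)
X = np.column_stack([np.ones(10), x])  # intercept and one regressor, so k = 2
y = x**2                               # deliberately not linear, so residuals are non-zero
ml, reml = variance_estimates(y, X)
print(ml, reml, reml / ml)
```

The ML estimate is smaller than the REML estimate by the factor (*n*−*k*)/*n*, which is the downward bias that the restricted approach removes.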

### 9. LINKING SPATIAL MODELS TO HIERARCHICAL MODELS WITH INTERACTION EFFECTS

With the emergence of interaction-based models (Akerlof, 1997; Manski, 2000; Brock and Durlauf, 2001), research has gradually moved from a pure spatial definition of neighborhood toward a multidimensional measure based on different forms of social distance and spillovers (Anselin, 1999; Anselin and Cho, 2002). In this setting, multilevel models with group effects are generally defined as economic environments where the payoff function of a given agent takes as direct arguments the choice of other agents (Brock and Durlauf, 2001). A typical example is the emergence of social networks where it is often observed that people belonging to the same group tend to behave similarly (Manski, 2000). The propensity that a person behaves in a certain way varies positively with the dominant behavior in the group (Kandori, 1992; Bernheim, 1994).^{7}

We consider how multilevel models can be linked to spatial models and generalize them to incorporate more general forms of network dependence involving individuals belonging to the same group. We start by assuming a specific form of spatial dependence where the dependent variable, **Y**, depends on its spatial lag as in traditional SAR models

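$$\mathbf{Y} = \beta_0 \boldsymbol{\iota}_N + \rho_1 \mathbf{W}\mathbf{Y} + \mathbf{X}\beta_1 + \mathbf{Z}\gamma + \mathbf{u} + \mathbf{e} \tag{41}$$

where $\boldsymbol{\iota}_N$ is the $N$-dimensional column vector of ones, $\mathbf{W}$ is an $N \times N$ matrix with $G$ groups/areas each containing $w_j$ individuals so that $N = \sum_{j=1}^{G} w_j$, and $\mathbf{Z}$ is an ($N \times q$) matrix of contextual variables defining group characteristics. Let us start by assuming that $\mathbf{W}$ is a block matrix

$$\mathbf{W} = \operatorname{diag}(\mathbf{W}_1, \ldots, \mathbf{W}_G), \qquad \mathbf{W}_j = \frac{1}{w_j}\boldsymbol{\iota}_{w_j}\boldsymbol{\iota}_{w_j}' \tag{42}$$

where $\boldsymbol{\iota}_{w_j}$ is the $w_j$-dimensional column vector of ones. Elements in matrix $\mathbf{W}$ indicate that individuals within a group are affected by the (average) behavior of other individuals residing in the same location, in other words by other members of the same group. Figure 5 depicts a situation resembling a complete network, as described in Section 3, where each unit $I_{ij}$, $i = 1, \ldots, w_j$, $j = 1, \ldots, G$, interacts in the same way with all the other units in the group.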
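The complete-network case can be sketched numerically. The block below assumes that each diagonal block of **W** places equal weight $1/w_j$ on every member of group $j$ (own observation included), so that **WY** returns the group mean for every unit; the function name and the data are hypothetical.

```python
import numpy as np

def complete_network_W(group_sizes):
    """Block-diagonal W whose j-th block is (1/w_j) * ones((w_j, w_j))."""
    N = sum(group_sizes)
    W = np.zeros((N, N))
    start = 0
    for w_j in group_sizes:
        # Every member of group j receives equal weight 1/w_j.
        W[start:start + w_j, start:start + w_j] = 1.0 / w_j
        start += w_j
    return W

# Two hypothetical groups with 2 and 3 members.
W = complete_network_W([2, 3])
Y = np.array([1.0, 3.0, 10.0, 20.0, 30.0])
WY = W @ Y  # each entry is the mean outcome of that unit's own group
```

Because the off-diagonal blocks are zero, no interaction crosses group boundaries under this weighting.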

We can envisage many different forms of interactions among members of the same groups. Figure 6 illustrates a situation where individuals in the same group are affected by individual *I*_{3} without influencing each other. We can clearly generalize this analysis to other forms of network interactions, such as the nearest neighbor scenario introduced in Section 3 and described by the weight matrix **W**^{C}, in which case the block diagonal matrix is

Finally, we could also introduce different types of network structures across groups, further generalizing the structure of the block diagonal matrix **W**^{C}.

To illustrate the implication of group dependence for spatial models, we focus on the hierarchical complete network structure shown in Figure 5. The specification (41) together with the assumption of group dependence implied by (42) allows us to examine the particular case of a hierarchical model where, for ease of exposition, we assume for the moment that individuals (firms etc.) within the same group are affected in the same way by all the other members of the group.

Focusing on within-group effects, we therefore assume that $(\mathbf{W}\mathbf{Y})_{ij} = \bar{Y}_j$, the mean outcome in group $j$, so that each individual within group *j* has the same weight. This means that, somewhat differently from conventional spatial econometrics, the interindividual interactions do not spill across group boundaries, and within groups no account is taken of differential location leading to different weights according to distance between individuals. We will consider the implication of group spillovers in a hierarchical setting in the following section and show how introducing such externalities between groups is crucial for the identification of network interaction effects.

With this assumption, we can rewrite (41) as

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \rho_1 \bar{Y}_j + \gamma Z_j + u_j + e_{ij} \tag{43}$$

Following Manski (1993), $\beta_1$ gives the effect of individual-level characteristics $X_{ij}$, $\rho_1$ captures the strength of endogenous group effects $\bar{Y}_j$, $\gamma$ quantifies the exogenous or contextual effect $Z_j$, $u_j$ are random group effects, and $e_{ij}$ is an individual-specific random component capturing other unmodeled sources of variation in $Y_{ij}$. The problem here is the identification of the parameters in the presence of what Manski calls reflection, something that we address below. Of course, identification only becomes a problem in linear-in-means models, and any nonlinearity, for instance as in an expanded spatial Durbin model (see Gibbons and Overman, 2010), automatically solves the problem.

As an example, consider workers within companies, with wages $Y_{ij}$ dependent on individual worker attributes, $X_{ij}$, and on company-level contextual effects $Z_j$ (such as sector, company wages policy, level of research and development activity, investment, etc.). In addition, other unmeasured causes of individual wage variation are represented by random group (company) effects $u_j$ and individual random effects $e_{ij}$, and with $\rho_1 \neq 0$ wages may also be endogenously determined so that a higher wage level achieved by one worker spills over (via $\bar{Y}_j$) to other workers in the firm.

Taking group means of both sides of (43) and solving for $\bar{Y}_j$ (assuming $\rho_1 \neq 1$) results in the between-group regression

$$\bar{Y}_j = \frac{\beta_0}{1-\rho_1} + \frac{\beta_1}{1-\rho_1}\bar{X}_j + \frac{\gamma}{1-\rho_1} Z_j + u'_j \tag{44}$$

where $u'_j = (u_j + \bar{e}_j)/(1-\rho_1)$.^{8} If $Z_j = \bar{X}_j$ (reflection), substituting (44) into (43) and centering, we obtain

$$Y_{ij} = \frac{\beta_0}{1-\rho_1} + \beta_1 (X_{ij} - \bar{X}_j) + \frac{\beta_1 + \gamma}{1-\rho_1}\bar{X}_j + u''_j + e_{ij} \tag{45}$$

with $u''_j = u_j + \rho_1 u'_j$.^{9} This is an example of Manski's reflection problem: in situations where group average characteristics, $\bar{X}_j$, directly affect the individual outcome, $Y_{ij}$, the parameters in the structural model are not identified. In this case, the number of coefficients in the reduced form (45) is not sufficient to identify the coefficients in the structural equation (43). We have four parameters to identify in the structural equation (43) but only three estimated coefficients in (45). Note that without group effects ($\rho_1 = \gamma = 0$), the reduced form simplifies to the basic one-way error component model $Y_{ij} = \beta_0 + \beta_1 X_{ij} + u_j + e_{ij}$. In other words, group effects generate an excess between-group variance by introducing mean peer characteristics, $\bar{X}_j$, as an effect on outcomes.^{10}
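The counting argument (four structural parameters, three reduced-form coefficients) can be checked numerically. The sketch below assumes the linear-in-means structure $Y_{ij} = \beta_0 + \beta_1 X_{ij} + \rho_1 \bar{Y}_j + \gamma Z_j + u_j + e_{ij}$ with $Z_j = \bar{X}_j$, under which substitution gives reduced-form coefficients $\beta_0/(1-\rho_1)$, $\beta_1$, and $(\beta_1+\gamma)/(1-\rho_1)$. The function name and the second parameter set are hypothetical.

```python
import math

def reduced_form(beta0, beta1, rho1, gamma):
    """Map structural parameters to the reduced-form coefficients
    (intercept, within-group slope, between-group slope) under reflection."""
    if rho1 == 1:
        raise ValueError("rho1 must differ from 1")
    return (beta0 / (1 - rho1), beta1, (beta1 + gamma) / (1 - rho1))

# Two different structural parameter sets ...
a = reduced_form(beta0=0.5, beta1=0.6, rho1=0.5, gamma=0.5)
b = reduced_form(beta0=0.75, beta1=0.6, rho1=0.25, gamma=1.05)

# ... that imply the same reduced form, so data cannot tell them apart.
same = all(math.isclose(p, q) for p, q in zip(a, b))
```

Only $\beta_1$ is pinned down by the reduced form; $\rho_1$ and $\gamma$ can be traded off against each other (with $\beta_0$ adjusting) without changing any observable coefficient.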

In the following section, we will show how the connection between spatial models and multilevel models with group interactions (ρ_{1}≠ 0, γ≠ 0) allows parameter identification using simple instruments for the endogenous effects, $\bar{Y}_j$.

#### Spatial Effects and Identification

Extending the work by Cohen-Cole (2006) to the area of modeling interaction effects in a multilevel setting (see Corrado, 2009), we can rewrite (43) in a way that takes into account possible interdependencies not only within groups but also across groups, where a group may be those living in a specific district or region. We assume intragroup effects $\bar{Y}_j$, where $\bar{Y}_j$ is an average of all $w_j$ unit responses within group $j$ and, importantly, we now introduce out-group effects, which are equal to an average of the responses across all ‘neighboring’ groups, except group $j$. Typically, we might consider the out-group average to be based only on those groups that are spatially or socially proximate. Note that thus far the literature has assumed that all surrounding groups enter with the same weight, but in this paper we introduce the innovation of differential weighting. Similarly, we introduce out-group contextual effects, thus leading to the model

- (46)

in which the outcome of individual $i$ in group $j$, $Y_{ij}$, depends on the average outcome of group $j$, where the implicit assumption is that unit $i$ is affected equally by all other units in $j$.

We also assume that the individual outcome depends on the average outcome, and average contextual effects, of other groups “surrounding” group *j*. We represent the endogenous out-group spillover variable in deviation form, as the mean within the group minus the mean in regions ‘nearby.’ Likewise, the contextual spillover variable is specified in deviation form. In order to identify all the relevant parameters in the model, we consider the average relationship derived from (46)

- (47)

where .^{11}

Substituting (47) into (46) and assuming reflection as before, we obtain

- (48)

where $u''_j = u_j + (\rho_1 + \rho_2) u'_j$. It is clear from (48) that we can identify all the parameters in the structural equation (46) if the number of level-1 units (typically the number of individual people) exceeds the number of groups and the out-group effects are nonzero, that is, if for some $j \neq l$ agents in one group are affected by the value of $Y$ or by contextual effects $Z$ in other surrounding groups. Hence, one could use outer groups' behavior as instruments to identify the endogenous effects, $\bar{Y}_j$. For example, in the wage equation one could use as instruments wage and company characteristics in neighboring companies, in other sectors, or at different geographical/hierarchical levels.

One important issue is collinearity, but the collinearity between $X_{ij}$ and $\bar{X}_j$ can be nullified by centering the variables. Reparameterizing, we obtain an equivalent hierarchical model^{12}:

- (49)

So, we find that if we have an HLM model specification that includes out-group effects in both the outcome and the contextual variables, then this facilitates identification (c.f. Manski, 1993) of the model parameters in equation (46) because we can use the out-group averages of $Y$ and $Z$ as instruments for the endogenous variable $\bar{Y}_j$. In doing this we assume that our instruments are not weak, and are orthogonal to the disturbances.
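The instrumenting idea can be illustrated with a generic two-stage least squares sketch. This is not the RIGLS/REML estimator used in the paper; it is a textbook IV calculation on hypothetical data, in which `instrument` stands in for an out-group average and `endogenous` for the in-group mean outcome. All names and numerical values are assumptions made for the illustration.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """Generic 2SLS estimate of b in y = X b + error.

    X: (n, k) regressors, including the endogenous one(s).
    Z: (n, m) instruments with m >= k, including the exogenous columns of X.
    """
    # First stage: fitted values of every regressor given the instruments.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the first-stage fitted values.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Hypothetical data: the regressor shares the component v with the error.
rng = np.random.default_rng(1)
n = 500
instrument = rng.normal(size=n)       # stands in for an out-group average
v = rng.normal(size=n)
endogenous = instrument + v           # stands in for the in-group mean
error = 0.8 * v + rng.normal(scale=0.5, size=n)
y = 1.0 + 2.0 * endogenous + error

X = np.column_stack([np.ones(n), endogenous])
Z = np.column_stack([np.ones(n), instrument])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_2sls = two_stage_least_squares(y, X, Z)
```

In this set-up OLS is expected to be pulled away from the true slope of 2 because the regressor is correlated with the error, whereas the 2SLS estimate should lie closer to it; the exact numbers depend on the random draw.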

#### Estimating Group Spillovers in Hierarchical Models

The specifications developed thus far in fact embody specific **W** matrix assumptions (as described below). Given these, we could envisage a generalization of (49) in order to incorporate more general forms of group interaction effects

- (50)

which resembles the spatial Durbin specification (30) in Section 4, where **W** is the matrix that defines the within-group effects and **W**_{l} the out-group effects. Hence, when estimating the simpler specification where **W**_{l} **Z**=**W**_{l} **Y**= 0, not only are we unable to identify both endogenous and exogenous interaction effects, but we also face an omitted variable problem. In this case, unobserved (omitted) covariates would be correlated with the observed (included) covariates. Since the observed covariates also include the endogenous spatial lag, the significance of its coefficient would be misleading, since it would pick up the effects of the omitted variables **W**_{l} **Z** and **W**_{l} **Y**. This occurrence is particularly problematic when estimating models with random effects since the maintained assumptions are that both the individual and the group error components should be uncorrelated with the regressors, *Cov*(**u**, **X**) = 0 and *Cov*(**e**, **X**) = 0. This assumption is of course the same as for random effects panel models, as would typically be tested by means of a Hausman test of parameter consistency. This involves comparison of fixed and random effects estimates, assuming the fixed effects estimates are consistent by being independent of the idiosyncratic disturbances. In the hierarchical case, omitting spatial effects for **X** and **Z** will likewise create inconsistent estimators since the error components will be correlated with the regressors.

While we do not have a test that is equivalent to the Hausman test, we do have a way of avoiding inconsistency resulting from omitted variables. This is by means of the inclusion of spatial effects **W**_{l} **Z** and **W**_{l} **Y**. In fact these play a dual role: (i) avoiding the omitted variable problem that may afflict models with endogenous spatial lags and (ii) introducing a source of exogenous variation that helps to identify both endogenous and exogenous group effects, which seems to be vital to most of the empirical work on network interactions. In the context of model (50), estimation of both the observed and unobserved components is achieved via RIGLS/REML, as discussed in the preceding section, given **W**_{l} **Y**, **W**_{l} **Z**, and **WX**.

We now consider a simulation exercise. The basis of this is the group set-up. Groups can be unequal in size, and the weights do not have to be equal as in equation (42), so that group members can carry differential weights, for example, for a four-member group

- (51)

in which *a*+*b*+*c*+*d*= 1. Thus, **W**_{1} can have a different number of members and different weights (also summing to 1 across rows), compared with **W**_{2} , … , **W**_{G} where *G* is the number of groups.

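This leads to our square block-diagonal **W** matrix, which is

$$\mathbf{W} = \begin{pmatrix} \mathbf{W}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_2 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{W}_G \end{pmatrix}$$

in which 0 stands for a submatrix of 0s of appropriate dimension.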
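A within-group matrix with unequal group sizes and differential weights can be assembled as follows. This is a minimal sketch: the raw weights are hypothetical, and each row is rescaled so that it sums to 1, as required of the group blocks.

```python
import numpy as np

def row_normalize(M):
    """Rescale each row of M so that it sums to 1."""
    M = np.asarray(M, dtype=float)
    return M / M.sum(axis=1, keepdims=True)

def block_diag(blocks):
    """Place square blocks along the diagonal, zeros elsewhere."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

# Hypothetical four-member group with differential weights.
W1 = row_normalize([[1, 2, 3, 4],
                    [4, 3, 2, 1],
                    [1, 1, 1, 1],
                    [2, 1, 1, 2]])
# Hypothetical three-member group with equal weights.
W2 = row_normalize(np.ones((3, 3)))

W = block_diag([W1, W2])  # 7 x 7, no weight crosses group boundaries
```

Groups of different size and with different weighting schemes sit side by side on the diagonal, and the zero off-diagonal blocks encode the absence of within-group weights across groups.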

The spatial lagging matrix **W**_{l} identifies the location and strength of spillovers between groups, for example,

Given this, we start by considering a generalization of the structural equation (46)

- (52)

We can therefore simulate our data from the following process for **Y**^{13}

- (53)

We assume that we have eight groups composed of six members each so that *G*= 8, *w _{j}*= 6, and *N*= 48, and the matrices of interactions within each group are assumed to take the form of (51), with each row summing to 1. Given the matrices **W**_{1}, **W**_{2} , … , **W**_{8}, we therefore define the block matrices of within- and between-group interactions **W** and **W**_{l}. The value of **Y** is assumed to depend on an exogenous variable **X**, the contextual variable **Z**, and the random disturbances **e**∼*N*(0, σ^{2}_{e}) and **u**∼*N*(0, σ^{2}_{u}), while **I** is an identity matrix of dimension *N*. The values of the *N* by 1 vectors **Z** and **X** are set exogenously; **e** and **u** are *N* by 1 vectors of normally distributed pseudorandom numbers for the individual and the group random effects that are independent of the exogenous regressors. Given β_{0}= 0.5, β_{1}= 0.6, γ= 0.5, ρ_{1}= 0.5, and ρ_{2}= 0.9, we generate **Y** via (54). We ensure throughout that (**I**− (ρ_{1}+ρ_{2}) **W**+ρ_{2}**W**_{l}) is nonsingular.^{14}
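A data-generating step of this kind might be sketched as below. Several ingredients are assumptions rather than the authors' code: the structural form is taken to be $\mathbf{Y} = \beta_0 + \beta_1\mathbf{X} + \rho_1\mathbf{W}\mathbf{Y} + \rho_2(\mathbf{W} - \mathbf{W}_l)\mathbf{Y} + \gamma\mathbf{Z} + \mathbf{u} + \mathbf{e}$, which is consistent with the nonsingularity condition on $\mathbf{I} - (\rho_1+\rho_2)\mathbf{W} + \rho_2\mathbf{W}_l$; within-group weights are equal; groups are arranged on a hypothetical ring in which each group neighbors the two adjacent groups; and the standard deviations of **u** and **e** are arbitrary.

```python
import numpy as np

G, w = 8, 6                      # eight groups of six members
N = G * w
beta0, beta1, gamma, rho1, rho2 = 0.5, 0.6, 0.5, 0.5, 0.9

group = np.repeat(np.arange(G), w)            # group label of each unit

# Within-group matrix: equal weight 1/w on every member of own group.
W = (group[:, None] == group[None, :]) / w

# Hypothetical between-group matrix: groups on a ring, each unit averages
# over all members of the two adjacent groups.
gap = (group[:, None] - group[None, :]) % G
W_l = ((gap == 1) | (gap == G - 1)) / (2 * w)

rng = np.random.default_rng(0)
X = rng.normal(size=N)                              # individual covariate
Z = np.repeat(rng.normal(size=G), w)                # group-level contextual variable
u = np.repeat(rng.normal(scale=0.3, size=G), w)     # group effects (assumed scale)
e = rng.normal(scale=0.1, size=N)                   # individual effects (assumed scale)

A = np.eye(N) - (rho1 + rho2) * W + rho2 * W_l
assert np.linalg.matrix_rank(A) == N                # nonsingularity check

rhs = beta0 + beta1 * X + gamma * Z + u + e
Y = np.linalg.solve(A, rhs)                         # reduced-form solution for Y
```

Solving the linear system rather than inverting the matrix explicitly gives the same **Y** and is the usual numerical choice; the rank check mirrors the requirement that the matrix be nonsingular.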

Having generated **Y** using (54), we estimate two models. One is the correctly specified model (52), denoted model A in Table 5, and the second, model B, is the misspecified model that omits the spatially dependent variable **W**_{l} **Y**, by imposing ρ_{2}= 0, and the out-group effect, **W**_{l} **Z**. Thus,

- (55)

| (RIGLS/REML) | Model A: Coefficient | Model A: S.E. | Model B: Coefficient | Model B: S.E. |
|---|---|---|---|---|
| β_{0}= 0.5 | −0.045 | 0.275 | −6.028 | 0.146 |
| β_{1}= 0.6 | 0.589 | 0.007 | 0.589 | 0.000 |
| γ= 0.5 | 0.479 | 0.011 | −0.114 | 0.065 |
| ρ_{1}= 0.5 | 0.507 | 0.010 | 0.761 | 0.000 |
| ρ_{2}= 0.9 | 0.873 | 0.021 | − | − |
| σ^{2}_{u} | 0.129 | 0.092 | 41.11 | 26.02 |
| σ^{2}_{e} | 0.008 | 0.001 | 0.008 | 0.001 |
| N | 48 | | 48 | |
| ll | 13.05 | | 1.388 | |

The REML estimates reported in Table 5 show that when we fit a model that incorrectly omits the out-group effects **W**_{l} **Y** and **W**_{l} **Z**, we overstate the effect of the endogenous lag **WY**, as is apparent in the upwardly biased estimate of ρ_{1} (0.761 against a true value of 0.5). In addition, the inflated estimate of the random effects variance σ^{2}_{u} may arise because it is also capturing the omitted variables, so that it is not a true (unobserved) group effect. Therefore, care must be taken in HLM estimation when assuming group independence.

### 10. CONCLUSIONS

We ask the question, “Where is the economics in spatial econometrics?” Our answer is that economic theory is to be found underpinning many spatial econometric models, but it is evident that when it comes to the so-called **W** matrix, the economic foundation of many models is at its weakest. Modeling spatial interaction in the economic context means in many cases modeling externalities and spillovers. These are elusive and difficult to pin down, which is probably why we have considerable difficulties in defining the structure of **W** unambiguously. The difficulty of detecting and measuring spatial spillover phenomena was recognized by Krugman (1991), who famously remarked that knowledge flows “leave no paper trail by which they can be measured or tracked, and there is nothing to prevent the theorist from assuming anything about them that she likes.” We have called for a stronger, more theoretical basis for **W** to supplement the very significant atheoretical empirical foundations that dominate, something that might emerge from current work on games, network formation, dynamics, and equilibria that is occurring within the social sciences, notably within the economics of networks. We have attempted to show that the concept of the **W** matrix is nevertheless necessary in one form or another and is in any case almost inescapable. It first comes to our attention as a convenient, useful, and succinct representation of spatial interaction, either in the form of endogenous or exogenous lagged variables, and/or as part of an explicit error process.

However, we also find that the **W** matrix has a pervasive presence extending far beyond cross-sectional models to multiequation models and panel data models. Denying **W** does not eliminate it but merely suppresses it, only for it to reappear as assumptions about distance and interaction in competing or complementary approaches, with a few exceptions, such as SUR, vector autoregressions, and vector error correction models applied to multivariate time series. Moreover, we have shown that it exists not only trivially in implied form in autoregressive time series processes, but is, we now discover, a cognate part of hierarchical models. We have considered in some detail the connection between hierarchical models and the standard spatial econometric specifications, because hierarchical models are almost completely absent from the spatial econometrics literature, and represent one major alternative way of capturing spatial effects, focusing on the multilevel aspects of causation that are a reality of many spatial processes. Our contribution in this regard is to show that **W** appears also in a specific form as part of the structure of hierarchical models. This highlights the limitations of much of current HLM, which should benefit from cross-fertilization from the spatial econometrics literature, with the prospect of a whole new research agenda embodying differential spatial dependence within and between groups in the multilevel context.

- 1
The paper by Pinkse and Slade (2010) raises some additional difficult-to-resolve problems arising from a limited and somewhat atypical selection of the so-called ‘spatial econometrics’ literature.

- 2
Over the past decade, there has been a development of methods that have enabled researchers to model hierarchical data. Examples of these methods include multilevel models (see, e.g., Goldstein, 1998), random coefficient models (Longford, 1993), and hierarchical multilevel models proposed by Goldstein (1986) based on iterative generalized least squares (IGLS).

- 3
For example, Holly, Pesaran, and Yamagata (2010) suggest a weighting matrix in a house-price equation where connections between London and other U.K. regions are based on the inflow and outflow of commuters.

- 4
Bhattacharjee and Holly (2006) find evidence of strategic spillovers across the members of the Bank of England's Monetary Policy Committee in the way they vote on interest rate changes.

- 5
Units are neighbors under Rook's criterion if they share the same borders.

- 6
The method is currently implemented in the software MLwiN.

- 7
Other influences are the so-called peer-influence effects that have been extensively examined both in education (Bénabou, 1993), in the psychology literature (Brown, Clasen, and Eicher, 1986; Brown, 1990), and in the occurrence of social pathologies (Krosnick and Judd, 1982; Bauman and Fisher, 1986; Jones, 1994).

- 8
See Snijders and Bosker (1999, p. 53) for the derivation of the between-group regression in a multilevel setting.

- 9
Technical Appendix A with the proof of equation (45) is available at http://sites.google.com/site/luisacorrado/publications.

- 10
By allowing for the possibility that the conditional mean of group effect and the individual effect vary with group size, Graham (2008) also allows for the possibility that peer-group effects may differ according to class size, being stronger in bigger classes. A way to detect the presence of group effects in this case is to measure the excess between-group variance defined as the ratio of unconditional (scaled) between-group and within-group variances

- 11
Spatial autocorrelation in the error term exists because $Y_{ij}$ depends on $u_j$, which also affects the within-group and out-group average outcomes through the coefficients ρ_{1} and ρ_{2}.

- 12
Technical Appendix B with the proof of equation (49) is available at http://sites.google.com/site/luisacorrado/publications.

- 13
Note that the same result can also be obtained by simulating **Y** from the reduced form (50):

- (54)

where **u**′= (**I**− (ρ_{1}+ρ_{2}) **W**+ρ_{2}**W**_{l})^{−1}(**We** + **u**).

- 14
The code to generate the data for the estimations in Table 5 is available at http://personal.strath.ac.uk/bernard.fingleton/ and http://sites.google.com/site/bernardfingleton/software.

### REFERENCES

- 1997. “Social Distance and Social Decisions,” Econometrica, 65, 1005–1028.
- 1991. “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation,” Econometrica, 59, 817–858.
- 1986. “Non-nested Tests on the Weight Structure in Spatial Autoregressive Models,” Journal of Regional Science, 26, 267–284. .
- 1988. *Spatial Econometrics: Methods and Models*. The Netherlands : Kluwer Academic Publishers. .
- 2003. “Spatial Externalities,” International Regional Science Review, 26, 147–152. .
- 2010. “Thirty Years of Spatial Econometrics,” Papers in Regional Science, 89, 3–25. .
- 2002. “Spatial Effects and Ecological Inference,” Political Analysis, 10, 276–303. and
- 2007. “Spatial Panel Econometrics,” in Lászlo Matyas and (eds.), The Econometrics of Panel Data, Fundamentals and Recent Developments in Theory and Practice. Dordrecht : Kluwer, 3rd edn., pp. 625–660. , , and .
- 2003. “Properties of Tests for Spatial Error Components,” Regional Science and Urban Economics, 33, 595–618. and .
- 2007. “No Man is An Island: The Inter-personal Determinants of Regional Well-Being in Europe,” Cambridge Working Papers in Economics 0717, Faculty of Economics, University of Cambridge. and .
- 2006. “Prediction in the Panel Data Model with Spatial Correlation: The Case of Liquor,” Spatial Economic Analysis, 1, 75–185. and .
- 2010. “Increasing Returns and the Spatial Structure of French Wages,” Spatial Economic Analysis, 5, 73–79. .
- 1986. “On the Measurement of Friend Behavior in Research on Friend Influence and Selection: Findings from Longitudinal Studies of Adolescent Smoking and Drinking,” Journal of Youth and Adolescence, 15, 345–353. and .
- 1993. “Workings of a City: Location, Education, and Production,” Quarterly Journal of Economics, CVIII, 619–652. .
- 1994. “A Theory of Conformity,” Journal of Political Economy, 102, 841–877. .
- 2006. “Taking Personalities out of Monetary Policy Decision Making? Interactions, Heterogeneity and Committee Decisions in the Bank of England MPC,” Centre for Dynamic Macroeconomic Analysis Working Paper Series 0612, School of Economics and Finance, University of St. Andrews. and .
- 1997. “Correlated Cluster Effects and Simultaneity in Multilevel Models,” Health Economics, 1, 6–13. and .
- 1975. “Estimation of the Coefficients of a Linear Regression in the Presence of Spatial Autocorrelation: An Application to a Belgium Labor Demand Function,” Environment and Planning, 7, 455–472. and .
- 2001. “Interactions-Based Models,” in James J. Heckman, and (eds.), Handbook of Econometrics, Vol. 5. Amsterdam: Elsevier Science, pp. 3297–3380. and .
- 1990. “Peer Groups and Peer Cultures,” in S. Shirley Feldman, and (eds.), At the Threshold: the Developing Adolescent. Cambridge MA : Harvard University Press, pp. 171–196.
- 1986. “Perceptions of Peer Pressure, Peer Conformity Dispositions and Self-Reported Behavior Among Adolescents,” Developmental Psychology, 22, 521–530. , , and .
- 1992. Hierarchical Linear Models for Social and Behavioral Research: Applications and Data Analysis Methods. Newbury Park , CA : Sage Publications. and .
- 2010. “Bootstrap Inference in Spatial Econometrics: the J Test,” Spatial Economic Analysis, 5, 93–119. and .
- 2001. “Sources of Sectoral Employment Fluctuations in Central Cities and Suburbs: Evidence from Four Eastern U.S. Cities,” Journal of Urban Economics, 49, 199–218. and .
- 1973. Spatial Autocorrelation. London : Pion. and .
- 1981. Spatial Processes: Models and Applications. London : Pion. and .
- 2006. “Multiple Groups Identification in the Linear-in-Means Model,” Economics Letters, 92, 157–162. .
- 1999. “GMM Estimation with Cross-Sectional Dependence,” Journal of Econometrics, 92, 1–45.
- 2002. “Economic Distance, Spillovers and Cross Country Comparisons,” Journal of Economic Growth, 7, 157–187. and .
- 2002. “Socio-economic Distance and Spatial Patterns in Unemployment,” Journal of Applied Econometrics, 17, 303–327. and .
- 2009. “A Unified Solution to the Identification and Estimation of Social Interaction Effects,” Centre for Economic & International Studies Working Paper Series, University of Rome Tor Vergata. .
- 2010. “Multilevel Modelling with Spatial Effects,” University of Strathclyde, Discussion Papers in Economics, No. 11-05. and .
- 2003. Statistics for Spatial Data. New York : Wiley. .
- 2007. “Growth, Technological Interdependence and Spatial Externalities: Theory and Evidence,” Journal of Applied Econometrics, 22, 1033–1062. and .
- 2001. “Equilibrium and Economic Growth: Spatial Econometric Models and Simulations,” Journal of Regional Science, 41, 117–147. .
- 2003. “Increasing Returns: Evidence from Local Wage Rates in Great Britain,” Oxford Economic Papers, 55, 716–739. .
- 2006. “The New Economic Geography Versus Urban Economics: An Evaluation Using Local Wage Rates in Great Britain,” Oxford Economic Papers, 58, 501–530. .
- 2008. “A Generalized Method of Moments Estimator for a Spatial Model with Moving Average Errors, with Application to Real Estate Prices,” Empirical Economics, 34, 35–57. .
- 2009. “Spatial Autoregression,” Geographical Analysis, 41, 385–391. .
- 2008. “Estimating Spatial Models with Endogenous Variables, a Spatial Lag and Spatially Dependent Disturbances: Finite Sample Properties,” Papers in Regional Science, 87, 319–339. and .
- 2009. “Endogeneity in a Spatial Context: Properties of Estimators,” in Antonio Páez, , , and . (eds.), Progress in Spatial Analysis: Theory and Computation, and Thematic Applications, Advances in Spatial Sciences. Berlin : Springer-Verlag, pp. 59–73. and .
- 2006. “Empirical Growth Models with Spatial Effects,” Papers in Regional Science, 85, 177–198. and .
- 2008. “How to Get Rid of W: A Latent Variables Approach to Modelling Spatially Lagged Variables,” Environment and Planning A, 40, 2526–2538. and .
- 2010. “Mostly Pointless Spatial Econometrics?” Working paper. and .
- 1986. “Multilevel Mixed Linear Model Analysis Using Iterative Generalised Least Squares,” Biometrika, 73, 46–56. .
- 1989. “Restricted Unbiased Iterative Generalized Least-Squares Estimation,” Biometrika, 76, 622–623. .
- 1998. “Multilevel Models for Analysing Social Data,” in Encyclopaedia of Social Research Methods. Newbury Park , CA : Sage Publications. .
- 2009. Connections: An Introduction to the Economics of Networks. Princeton : Princeton University Press. .
- 2008. “Identifying Social Interactions through Conditional Variance Restrictions,” Econometrica, 76, 643–660.
- 1987. The Economics of Imperfect Competition: A Spatial Approach. Cambridge : Cambridge University Press. , , and .
- 2010. “In Search of W,” Spatial Economic Analysis, forthcoming. , , and .
- 2010. “Spatial and Temporal Diffusion of House Prices in the UK,” Journal of Urban Economics, 69(1), 2–23. , , and .
- 2008. “Random Coefficient Panel Data Models,” in László Matyas and (eds.), The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice, 3rd edn. Berlin : Springer Publishers, pp. 185–213. and .
- 1994. “Health, Addiction, Social Interaction, and the Decision to Quit Smoking,” Journal of Health Economics, 13, 93–110.
- 1992. “Social Norms and Community Enforcement,” Review of Economic Studies, 59, 63–80. .
- 2005. “Alternative Approaches to Estimation and Inference in Large Multifactor Panels: Small Sample Results with an Application to Modelling of Asset Results,” in Garry D.A. Phillips, and (eds.), The Refinement of Econometric Estimation and Test Procedures: Finite Sample and Asymptotic Analysis. Cambridge : Cambridge University Press, pp. 239–281. and .
- 2007. “Panel Data Models with Spatially Correlated Error Components,” Journal of Econometrics, 140, 97–130. , , and .
- 2008. “A Spatial J-test for Model Specification against a Single or a Set of Non-Nested Alternatives,” Letters in Spatial and Resource Sciences, 1, 3–11.
- 1998. “A Generalized Spatial Two-Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances,” Journal of Real Estate Finance and Economics, 17, 99–121. and .
- 2007. “HAC estimation in a Spatial Framework,” Journal of Econometrics, 140, 131–154. and .
- 1995. “Spatial Correlation: A Suggested Alternative to the Autoregressive Model,” in Luc Anselin and (eds.), New Directions in Spatial Econometrics. Berlin : Springer-Verlag, pp. 75–95. and .
- 1982. “Transitions in Social Influence in Adolescence: Who Induces Cigarette Smoking,” Developmental Psychology, 81, 359–368. and .
- 1991. Geography and Trade. Cambridge , MA : MIT Press. .
- 2002. “Modeling Social Influence through Network Autocorrelation: Constructing the Weight Matrix,” Social Networks, 24, 21–47.
- 2008. “Spatial Growth Regressions: Model Specification, Estimation and Interpretation,” Spatial Economic Analysis, 3, 275–304. and .
- 1999. “A Spatial Prior for Bayesian Autoregressive Models,” Journal of Regional Science, 39, 297–317. and .
- 2008. “Spatial Econometric Modeling of Origin-Destination Flows,” Journal of Regional Science, 48, 941–967. and .
- 2009. Introduction to Spatial Econometrics. Boca Raton , FL : CRC Press. and .
- 2007. “Bayesian Model Averaging for Spatial Econometric Models,” Geographical Analysis, 39, 241–267. and .
- 1993. Random Coefficient Models. London : Clarendon Press.
- 1993. “Identification of Endogenous Social Effects: The Reflection Problem,” Review of Economic Studies, 60, 531–542.
- 2000. “Economic Analysis of Social Interactions,” Journal of Economic Perspectives, 3, 115–136.
- 1974. “Conditional Logit Analysis of Qualitative Choice Behavior,” in Paul Zarembka (ed.), Frontiers in Econometrics. New York : Academic, pp. 105–142.
- 1979. “Qualitative Methods for Analysing Travel Behaviour of Individuals: some Recent Developments,” in S. Hensher (ed.), Behaviour Travel Modelling. London : Croom Helm , pp. 279–318.
- 2010a. “Issues in Spatial Data Analysis,” Journal of Regional Science, 50(1), 119–141.
- 2010b. “Perspectives on Spatial Econometrics: Linear Smoothing with Structural Models,” Working paper.
- 1986. “Random Group Effects and the Precision of Regression Estimates,” Journal of Econometrics, 32, 385–397.
- 1990. “An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Unit,” The Review of Economics and Statistics, 72, 334–338.
- 1975. “Estimation Methods for Models of Spatial Interaction,” Journal of the American Statistical Association, 70, 120–126. .
- 2008. “Omitted Variable Bias of OLS and Spatial Lag Models,” in A. Páez, , , and . (eds.), Progress in Spatial Analysis: Theory and Methods and Thematic Applications. Berlin : Springer, pp. 17–28. and .
- 2009. “Knowledge Flows across European Regions,” Annals of Regional Science, 43, 669–690. and .
- 2008. “Using the Variance Structure of the Conditional Autoregressive Specification to Model Knowledge Spillovers,” Journal of Applied Econometrics, 23, 235–256. and .
- 2007. “Network Analysis of Commuting Flows: A Comparative Static Approach to German Data,” Networks and Spatial Economics, 7, 315–331. , , , , and .
- 2007. “A Simple Panel Unit Root Test in the Presence of Cross-Sectional Dependence,” Journal of Applied Econometrics, 138, 2265–2312. .
- 2010. “The Future of Spatial Econometrics,” Journal of Regional Science, 50(1), 103–117. and .
- 2002. “Spatial Price Competition: a Semiparametric Approach,” Econometrica, 70, 1111–1153. , , and .
- 2005. “The Role of Economic Space in Decision Making, ADRES Lecture,” Annales D'Èconomie et De Statistique, 77, 1–20.
- 1999. Multilevel Analysis: an Introduction to Basic and Advanced Multilevel Modeling, London : Sage Publications. and .
- 2004. “A Bayesian Probit Model with Spatial Dependencies,” in James LeSage and (eds.), Advances in Econometrics, Volume 18: Spatial and Spatiotemporal Econometrics. Oxford : Emerald Group Publishing, pp. 127– 160 . and .
- 1967. “A Statistical Theory of Spatial Distribution Models,” Transportation Research, 1, 253–269.
- 1970. Entropy in Urban and Regional Modelling. London : Pion.