Improved Lagrange Multiplier Tests in Spatial Autoregressions

For testing lack of correlation against spatial autoregressive alternatives, Lagrange multiplier tests enjoy their usual computational advantages, but the χ² first-order asymptotic approximation to critical values can be poor in small samples. We develop refined tests for lack of spatial error correlation in regressions, based on Edgeworth expansion. In Monte Carlo simulations, these tests, and bootstrap tests, generally significantly outperform χ²-based tests.


INTRODUCTION
The spatial autoregressive (SAR) model is a parsimonious tool for describing spatial correlation, conveniently depending only on economic distances rather than geographical locations, which might be unknown or irrelevant. Thus, it provides a convenient, widely usable class of alternatives in testing the null hypothesis of spatial uncorrelatedness, which, if true, considerably simplifies statistical inference. A linear regression with SAR disturbances is given by

y = Xβ + u,  u = λW u + ε,  (1.1)

where y = (y_i) is an n × 1 vector of observations, X = (x_ij) is an n × k matrix of nonstochastic regressors, β is a k × 1 vector of unknown parameters, ε = (ε_i) is an n × 1 vector of unobservable, mutually independent, random variables, with zero mean and unknown variance σ², λ is an unknown scalar, and W = (w_ij) is a given n × n weight matrix, such that w_ii = 0, 1 ≤ i ≤ n, and typically satisfying normalization restrictions (which aid identification of λ). A special case of (1.1) is pure SAR, or SAR for y,

y = λW y + ε,  (1.2)

when β = 0 a priori, and SAR for y with constant mean, when k = 1 and X = l, the n × 1 vector of ones, i.e.,

y − βl = λW (y − βl) + ε.  (1.3)
When W is row normalized such that W l = l, (1.3) becomes the intercept model

y = γl + λW y + ε,  (1.4)

where γ = (1 − λ)β. When λ ≠ 0, (1.1) implies that the y_i are spatially correlated, but under the null hypothesis

H0 : λ = 0,  (1.5)

they are mutually independent. Various tests of (1.5) have been discussed in the literature (see, e.g., Moran, 1950, Cliff and Ord, 1972, Burridge, 1980, Kelejian and Robinson, 1992, and Pinkse, 2004). For example, Wald and likelihood ratio (LR) tests have been developed, assuming that the ε_i are normally distributed (e.g., Ord, 1975). However, these involve the maximum likelihood (ML) estimates of λ, β and σ², which are not defined in closed form, and the likelihood need not be unimodal. Although Lagrange multiplier (LM) tests, following Moran (1950), are not guaranteed to be consistent against all violations of (1.5), and can have low power against alternatives for which they are inconsistent, they share the optimal local efficiency properties of Wald and LR tests while being computationally simpler, involving closed-form estimates of β and σ². Anselin (2001) surveyed LM testing in SAR models. Under (1.5) and regularity conditions, LM, Wald and LR statistics against the two-sided alternative

H1 : λ ≠ 0  (1.6)

each have a null limiting χ²₁ distribution as n → ∞, and provide consistent tests. Frequently, however, spatial economic data sets are not very large, and the χ²₁ approximation might be inaccurate. This is of particular concern in the SAR setting, where convergence to the limit distribution can be slower than the classical parametric rate (as found for the ML estimate in SAR models by Lee, 2004). Table 1 reports simulated sizes of Wald, LR and LM tests of (1.5) for SAR y, (1.2), with ε_i ∼ N(0, 1) and 1000 replications, and W following the Case (1991) specification

W = I_r ⊗ B_m,  B_m = (l_m l_m′ − I_m)/(m − 1),  (1.7)

where I_s denotes the s × s identity matrix and l_m is the m × 1 vector of ones, so n = mr.
In (1.7), r might represent the number of districts and m the number of households per district, so households are neighbours if and only if they belong to the same district, and neighbours are equally weighted. The four (m, r) combinations in Table 1, corresponding to n = 40, 96, 198 and 392, are designed to reflect an asymptotic regime where convergence is slower than the parametric rate, as discussed subsequently. The empirical sizes should be compared with the nominal α = 5%: the χ² approximation is not very good, with Wald and LR being over-sized and LM under-sized, Wald and LM exhibiting little improvement with increasing n, and LR none. Thus, the issue of constructing tests that enjoy good size properties in modest samples seems worth pursuing. In this paper we start from the LM statistic because of its computational advantages and local efficiency, noting also that its signed square root is locally best invariant (King and Hillier, 1985). Ad hoc finite-sample corrections for LM tests have already been derived in the spatial econometrics literature. Robinson (2008) considers a wide class of residual-based, asymptotically χ² statistics, which include LM statistics for testing (1.5) in SAR models as special cases, and suggests transformed statistics, which are still asymptotically χ², but have exactly the mean and variance of a χ² variate and are therefore expected to have improved finite-sample properties. Baltagi and Yang (2013), in line with Koenker (1981), derive a standardized version of the square root of the LM statistic for testing (1.5) in a broad class of SAR-type models, which brings the mean exactly to zero and the variance closer to that of the normal limiting variate. Our main contribution is to develop tests based on the Edgeworth expansion of the distribution function of the LM statistic. We focus on tests against (1.6), but results for one-sided alternatives are simple corollaries.
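Concretely, this weight matrix is easy to construct; the following is a minimal sketch, assuming the usual block form W = I_r ⊗ B_m with B_m = (l_m l_m′ − I_m)/(m − 1), i.e., equal weights for within-district neighbours and a zero diagonal (the function and variable names are ours).

```python
import numpy as np

def case_weight_matrix(m, r):
    """Case-type block-diagonal weight matrix: r districts of m households;
    within a district every neighbour gets weight 1/(m - 1), and w_ii = 0."""
    B = (np.ones((m, m)) - np.eye(m)) / (m - 1)   # one district's block
    return np.kron(np.eye(r), B)                  # n x n, with n = m * r

W = case_weight_matrix(8, 5)   # n = 40, the smallest design in Table 1
```

By construction, W is symmetric, has non-negative elements and satisfies W l = l, properties invoked repeatedly below.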
Our Edgeworth-corrected tests are also compared in Monte Carlo simulations with bootstrap-based tests, which are expected to achieve a similar refinement (see, e.g., Singh, 1981, and Hall, 1992). Despite the advantages of bootstrap-based tests, we believe that our analytical approach is worthwhile: it sheds light on the magnitude of the correction terms and offers insight into the adequacy of the standard χ² approximation for different choices of W, while our refined test statistics are still relatively simple, require no further nuisance parameter estimates, and perform comparably to bootstrap tests in small and moderately sized Monte Carlo samples.
The derivation of the Edgeworth expansion for the distribution of LM under (1.5), and the corrected tests, are the focus of the following section. The proofs of the theorems are left to the Appendix. In Robinson and Rossi (2013), hereafter RR, Edgeworth-corrected tests of (1.5) in (1.2) and (1.4) are developed, based on the least-squares estimate of λ. While this estimate converges in probability to zero under (1.5), it is inconsistent, not converging in probability to λ when λ ≠ 0. In Section 3, we derive the finite-sample corrections of Robinson (2008) in the SAR case, so as to compare performance with Edgeworth-corrected tests. Some results on local power are presented in Section 4. A Monte Carlo comparison of the various tests is reported in Section 5. Section 6 contains final comments.

EDGEWORTH EXPANSION AND CORRECTED TESTS
The LM statistic for testing (1.5) in (1.1) against (1.6) is

LM = (n² / tr((W′ + W)W)) (y′PW Py / y′Py)²,  (2.1)

where P = I − X(X′X)⁻¹X′, I = I_n; in (1.2), P = I, and in (1.3), P = I − l(l′l)⁻¹l′. The statistic LM was derived by Burridge (1980), who noted that it is equivalent to that of Cliff and Ord (1972), which in turn is related to a statistic of Moran (1950). For extensions to more general models, see also Anselin (1988, 2001), Baltagi and Li (2004), and Pinkse (2004). As noted by Burridge (1980), (2.1) is also the LM statistic for testing (1.5) against the spatial moving average model u = ε + λWε (a corresponding equivalence to that found with time series models). The derivation of (2.1) is based on a Gaussian likelihood, but as usual its first-order limit distribution obtains more generally. Under suitable conditions, as n → ∞,

P(LM ≤ η) → F(η)  (2.2)

for any η > 0, where F denotes the distribution function (df) of a χ²₁ random variable. Thus, (1.5) is rejected in favour of (1.6) if LM exceeds the appropriate percentile of the χ²₁ distribution. We can likewise test (1.5) against a one-sided alternative, λ > 0 (λ < 0), by comparing the signed square root T = sign(y′PW Py) LM^{1/2} (respectively −T) with the appropriate N(0, 1) upper (lower) percentile. The present paper mainly focuses on two-sided tests.
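For illustration, the statistic can be computed in a few lines; this sketch uses the standard Burridge (1980) form LM = (n² / tr((W′ + W)W)) (y′PW Py / y′Py)², which is our reading of (2.1), and the helper name is ours.

```python
import numpy as np

def lm_statistic(y, X, W):
    """LM statistic for H0: lambda = 0 against SAR disturbances,
    asymptotically chi^2_1 under the null.  Pass X=None for pure SAR,
    where P = I; otherwise e = Py are the OLS residuals."""
    n = len(y)
    if X is None:
        e = y
    else:
        e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]  # OLS residuals
    trace = np.trace((W.T + W) @ W)
    return (n * (e @ W @ e) / (e @ e)) ** 2 / trace
```

The signed square root of this quantity is the statistic used for one-sided tests.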
We omit mild sufficient conditions for (2.2), because we wish to consider statistics with better finite-sample properties, and we justify these only under the following restrictive assumption.

ASSUMPTION 2.1. The ε_i are independent N(0, σ²) random variables.

The normality assumption is common in higher-order asymptotic theory, because Edgeworth expansions and the resulting test statistics are otherwise complicated by the presence of higher cumulants of ε_i.
For a real matrix A = (a_ij), let ||A|| be the spectral norm of A (i.e., the square root of the largest eigenvalue of A′A) and let ||A||_∞ be the maximum absolute row-sum norm of A (i.e., ||A||_∞ = max_i Σ_j |a_ij|, where i and j vary respectively across all rows and columns of A). We introduce the following assumption (Assumption 2.2). If W is row normalized such that W l = l, with w_ij = w_ji ≥ 0 for all i, j (as in (1.7)), part (b) is automatically satisfied. The sequence h defined in (c) can be bounded or divergent, and this distinction affects the rate of convergence to the null distribution, the order of the leading Edgeworth correction term being h/n. For W given by (1.7), h ∼ m, explaining our remark that the (m, r) used in Table 1, where m increases, slowly, with n, correspond to slow convergence.
In addition, we impose a standard boundedness and lack-of-multicollinearity condition on X. Throughout, K denotes a finite generic constant.

ASSUMPTION 2.3. Uniformly in i, j and n, |x_ij| ≤ K and, as n → ∞, ||(X′X/n)⁻¹|| = O(1).

For notational convenience, define
To ensure that the leading terms appearing in the following theorem are well defined, we introduce the following assumption. Both ω₁(·) and ω₂(·), defined in (2.10), are generally non-homogeneous quadratic functions of η with known coefficients.
Theorem 2.1 establishes expansions (2.11), for the case h → ∞ as n → ∞, and (2.12), for bounded h. Because (2.11) and (2.12) entail better approximations than (2.2) and depend on known quantities, they can be used directly in approximating the df of LM. The two outcomes in Theorem 2.1 create a dilemma for the practitioner, because it cannot be determined for finite n whether to treat h as divergent or bounded. However, (2.12) is justified also when h is divergent, because the extra term in the expansion, −2((k + 2)η − η²)/n, is o(h/n). We retain both (2.11) and (2.12) to stress the possible dependence of our expansion on both n and h, which is peculiar to SAR models, and the slow convergence of LM in case h is divergent.
To derive corrected tests, define w_α such that P(LM ≤ w_α) = 1 − α, so a test that rejects (1.5) when LM > w_α has exact size α. Let Φ(z_α) = 1 − α, where Φ denotes the standard normal df. From (2.2), a test based on (2.1) that rejects H0 in (1.5) against (1.6) when

LM > z²_{α/2}  (2.14)

has approximate size α. Theorem 2.1 can be used to derive approximations of w_α that are more accurate than z²_{α/2} (see Cordeiro and Ferrari, 1991). For h divergent and bounded, we define, respectively, Edgeworth-corrected critical values; the test that rejects when LM exceeds the corrected critical value, (2.19), has size that is closer to α than (2.14).
As an alternative to correcting critical values, we can apply Theorem 2.1 to construct a monotonic transformation of LM whose distribution better approximates the χ²₁ distribution than that of LM itself (see, e.g., Kakizawa, 1996); the resulting test, (2.23), is more accurate than (2.14).

MOMENTS-BASED CORRECTION
Robinson (2008) proposed both mean-adjusted and mean-and-variance-adjusted variants of (2.1), which might be expected to have better finite-sample properties than (2.1), while still being asymptotically χ²₁. Because mean adjustment alone might, for smallish n, increase variance, offsetting the gain in accuracy from centring, we focus on the mean-and-variance correction. Such corrected statistics are theoretically convenient because, under (1.5), (2.1) depends on the ratio ε′PW Pε/ε′Pε, which is independent of its denominator, so its moments can be explicitly calculated (Pitman, 1937).
The mean-and-variance-adjusted statistics of Robinson (2008) are denoted LM₁, for the case h → ∞ as n → ∞, and LM₂, for bounded h; their centring and scaling quantities follow from formulae for moments of normal quadratic forms (see, e.g., Ghazal, 1996), giving (3.9). By construction, LM₁ and LM₂ have mean and variance that are closer to those of a χ²₁ random variable than LM, so we expect the test that rejects H0 when LM_i exceeds the χ²₁ critical value, (3.10), where i = 1 for h divergent and i = 2 for h bounded, to have size closer to α than (2.14).

ANALYSIS OF LOCAL POWER
We now focus on testing (1.5) in (1.1) against the local alternatives (4.1), where S(x) = I − xW; for n large enough, |λ_n| < 1 and the existence of S⁻¹(λ_n) is guaranteed by Assumption 2.2. For Z ∼ N(0, 1), denote by Ψ(x; ν) the df of (Z + ν)², the noncentral χ²₁ random variable with non-centrality parameter ν.
The first-order asymptotic approximation to the df of LM under H1 (4.1) has error O((h/n)^{1/2}). Terms of higher order could be derived at the expense of considerable algebraic complication.
Theorem 4.1 can be used to derive a more accurate approximation to the local power of the LM test of H0 against (4.1). From Theorem 4.1, the test in (2.14) has local power given by (4.8). Even the signs of the correction terms can vary with W, but the terms can be numerically evaluated for any given W. It is therefore possible to establish whether the actual local power of (2.14) is likely to be higher or lower than its first-order approximation. It is worth stressing that (4.8) holds also for the tests (2.19), (2.23) and (3.10), because the extra terms implied by the size corrections would be of order o((h/n)^{1/2}). Hence, tests (2.19), (2.23) and (3.10) have sizes that are closer to α than (2.14), while having local power as in (4.8). This paper is concerned with refinements of the LM test, and a comparison of its higher-order power with that of other existing tests of (1.5) is beyond our scope. However, Theorem 4.1 can be useful for further studies on higher-order efficiency of tests of H0 (1.5) in SAR models, along the lines of, e.g., Peers (1971), Taniguchi (1991) or Rao and Mukerjee (1995).

BOOTSTRAP CORRECTION AND SIMULATIONS
We have carried out Monte Carlo simulations to investigate the finite-sample performance of the tests developed above, and of bootstrap tests. The Monte Carlo design follows Table 1. Bootstrap statistics LM*_j, j = 1, . . . , 199, are computed as in (5.1), each u*_j being a vector of independent N(0, y′Py/n) variables. For α = 0.05, denote by w*_α the largest value solving Σ_{j=1}^{199} 1(LM*_j ≤ w*_α)/199 ≤ 1 − α, with 1(·) denoting the indicator function. We reject H0 (1.5) against (1.6) when

LM > w*_α.  (5.2)

We choose W as in (1.7), whence h = m − 1, and W is symmetric, satisfies W l = l and has non-negative elements. Because the tests derived in the previous sections can vary depending on whether h is divergent or bounded, we reflect both cases in our choices of (m, r). We choose (m, r) = (8, 5), (12, 8), (18, 11) and (28, 14) (as in Table 1, corresponding to n = 40, 96, 198 and 392) to represent divergent h, and (m, r) = (5, 8), (5, 20), (5, 40) and (5, 80) (corresponding to n = 40, 100, 200 and 400) to represent bounded h. As in Table 1, the ε_i were generated as N(0, 1), and results are based on 1000 replications. In the tables, we denote by chi-square, Edgeworth, transformation, mean-variance correction and bootstrap the empirical sizes of tests (2.14), (2.19), (2.23), (3.10) and (5.2), respectively. In the text, we use the respective abbreviations C, E, T, MV and B. Tables 2-7 report empirical sizes of the tests for models (1.1), (1.2) and (1.3). Tables 2 and 3 concern the regression model with SAR disturbances (1.1), where k = 3, with X having first column l and the elements of the other two columns generated independently and uniformly on [0, 1], when h (and thus m in (1.7)) is divergent and bounded, respectively. The standard test C is considerably under-sized in both cases, and the overall pattern of the results is consistent with Theorem 2.1, where the df of LM converges at rate n when h is bounded and at the slower rate n/h when h is divergent.
Indeed, from the first row of Table 2, as n increases from 40 to 392, the deviation between empirical and nominal sizes decreases by only 47%, while from the first row of Table 3, the deviation decreases by 85% as n increases from 40 to 400. The Edgeworth-corrected tests E and T perform very well in both cases, offering an average (across the sample sizes considered) improvement over C of 52% and 54%, respectively, when h is divergent, and of 52% and 50% when h is bounded. The MV test is very under-sized, the discrepancy between actual and nominal values decreasing by only 2% and 18% relative to C for divergent and bounded h, respectively. The average improvement offered by B is 71% when h is divergent and 50% when h is bounded, and its performance is comparable (or even superior, in case h is divergent) to E and T. Overall, E, T and B perform very well.
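The under-sizing of C in this design is straightforward to reproduce. The following is a sketch of such a size simulation under our own choices of seed and replication count, with 3.841 as the 95% point of χ²₁; under H0 the OLS residuals do not depend on β, so β = 0 is imposed without loss.

```python
import numpy as np

def empirical_size(m=8, r=5, k=3, n_rep=1000, crit=3.841, seed=0):
    """Monte Carlo size of the uncorrected chi-square LM test (C) under
    H0 in the regression design of Tables 2-3: X = [l, U(0,1), U(0,1)],
    eps ~ N(0, 1), Case-type weight matrix."""
    rng = np.random.default_rng(seed)
    n = m * r
    W = np.kron(np.eye(r), (np.ones((m, m)) - np.eye(m)) / (m - 1))
    trace = np.trace((W.T + W) @ W)
    rejections = 0
    for _ in range(n_rep):
        X = np.column_stack([np.ones(n), rng.uniform(size=(n, k - 1))])
        y = rng.normal(size=n)        # under H0, y = X beta + eps with beta = 0
        e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        lm = (n * (e @ W @ e) / (e @ e)) ** 2 / trace
        rejections += lm > crit
    return rejections / n_rep
```

Reported sizes well below the nominal 0.05 would match the under-sizing of C noted above.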
Tables 4 and 5 concern pure SAR (1.2) for divergent and bounded h, respectively. Although less severely than in Tables 2 and 3, C is under-sized for all n. When h is divergent, as n increases from 40 to 392, the deviation between actual and nominal values decreases by 40%, while when h is bounded and n increases from 40 to 400 it decreases by 75%, consistent with Theorem 2.1 (with d = e = f = k = 0). Also, when h is divergent, sizes for E, T, MV and B are, respectively, on average across the sample sizes considered, 57%, 42%, 14% and 69% closer to 0.05 than those for C. These figures become 61%, 51%, 22% and 60% when h is bounded. In both cases, the performance of E, T and B is satisfactory, with B and E offering the greatest improvement when h is divergent and bounded, respectively. The test MV, again, is less satisfactory than T, E and B, even though its performance is slightly better than in Tables 2 and 3.
Tables 6 and 7 concern the intercept model (1.3)/(1.4) for divergent and bounded h, respectively. The pattern remains similar. On average, across the sample sizes considered, for E, T and B, the discrepancies between actual and nominal values are reduced by 65%, 46% and 74% when h is divergent, and by 57%, 88% and 52% when h is bounded. Overall, E, T and B perform well, with B offering the highest improvement when h is divergent and T considerably outperforming both E and B when h is bounded. Surprisingly, when h is divergent, the MV test is outperformed by C; on average, the empirical sizes for C are 28% closer to the nominal values than those for MV. However, when h is bounded MV offers an average improvement of 45% over C.
In Tables 8-13, we examine powers of (the non-size-corrected tests) C, E, T, MV and B against

H1 : λ = λ̄ ≠ 0,  (5.3)

for λ̄ = 0.1, 0.5 and 0.8. Tables 8 and 9 concern the same regression setting as Tables 2 and 3. We observe that C, E, T and B perform well for all n, with C slightly the worst. The few exceptions occur for λ̄ = 0.1, where E and T are outperformed by C for (m, r) = (18, 11) and (m, r) = (5, 80), respectively. MV, instead, is outperformed by C for all sample sizes in almost all settings. Overall, B seems to offer the highest power. Tables 10 and 11 concern pure SAR (1.2). Again, MV has, overall, the lowest power. More interestingly, when h is divergent, for λ̄ = 0.1 and λ̄ = 0.5, E and T offer slightly lower power than the standard test C for some sample sizes. In turn, C is outperformed by B for all sample sizes and all choices of λ̄. When h is bounded, instead, E, T and B have comparable performance and are superior to C.
Tables 12 and 13 concern the intercept model (1.3)/(1.4). Similarly to Tables 8-11, MV performs worst overall. When h is divergent, C has lower power than E, T and B, with a few exceptions in which E and T perform slightly worse than C (i.e., for λ̄ = 0.5 when (m, r) = (12, 8) and (m, r) = (28, 14)). Overall, when h is divergent, B seems to have the highest power. The pattern of the results for bounded h is similar to Table 11, with E, T and B performing similarly and offering higher power than C.
Comparisons can be made with the Monte Carlo results reported in RR. The settings only overlap to a limited extent, because RR studied only (1.2) and (1.4), not more general regression models, and they did not look at MV-type tests. However, they did include tests of the one-sided alternative λ > 0. Subject to this, we can compare the results in Tables 4-7 with the results of RR. Generally, their tests corresponding to our C tests are very over-sized, especially for the intercept model. Their Edgeworth and transformation tests are much improved, although still quite poorly sized for the smallest n, and on the whole our tests also perform better here. The bootstrap results are closer, with the LM tests doing better in 10 out of 16 cases.
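For completeness, the parametric bootstrap critical value of (5.1), used by the B tests above, can be sketched as follows; implementation details (seed handling, helper names) are ours.

```python
import numpy as np

def bootstrap_critical_value(y, X, W, alpha=0.05, B=199, seed=0):
    """Bootstrap w*_alpha as in (5.1): each pseudo-sample u*_j is a
    vector of iid N(0, y'Py/n) draws, the LM statistic is recomputed,
    and the largest w* with #{LM* <= w*}/B <= 1 - alpha is returned."""
    rng = np.random.default_rng(seed)
    n = len(y)
    P = np.eye(n) if X is None else np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    sigma2 = y @ P @ y / n
    trace = np.trace((W.T + W) @ W)

    def lm(u):
        e = P @ u
        return (n * (e @ W @ e) / (e @ e)) ** 2 / trace

    stats = np.sort([lm(rng.normal(0.0, np.sqrt(sigma2), n)) for _ in range(B)])
    k = int(np.floor((1 - alpha) * B))   # 189 of 199 when alpha = 0.05
    return stats[k - 1]
```

The test (5.2) then rejects H0 when the observed LM statistic exceeds this value.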
To assess robustness to non-normal errors, the design was repeated with non-normal ε_i, (5.4). Here, bootstrap statistics are obtained as in (5.1), but with each u*_j generated by resampling with replacement from the (centred) empirical distribution of Py.
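Relative to the parametric scheme, only the draw of each u*_j changes; a self-contained sketch of that resampling step (names ours):

```python
import numpy as np

def resample_errors(y, X, rng):
    """One nonparametric bootstrap error vector: resample with
    replacement from the centred empirical distribution of Py."""
    n = len(y)
    if X is None:
        e = y.copy()
    else:
        e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]   # Py
    e -= e.mean()                                          # centre
    return rng.choice(e, size=n, replace=True)
```

Critical values built from such draws do not rely on any distributional assumption on ε_i.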
Tables 14 and 15 report empirical sizes when h is divergent and bounded, respectively. The Edgeworth-corrected tests improve on C; indeed, when h is divergent the empirical sizes of E and T are 51% and 41% closer to 0.05, on average, across the sample sizes considered, but improve less when h is bounded (by 29% and 24%). As expected, B offers the greatest improvements because bootstrap critical values do not reflect distributional assumptions. On average, across n, the sizes obtained by bootstrap critical values are 62% and 56% closer to 0.05 than those based on C. Our results suggest that in the present setting our normality-based Edgeworth-corrected tests E and T provide a partial correction when normality does not hold, and perform at least as well as C.
Finally, Tables 16 and 17 display empirical powers of the tests of H0 in (1.5) for the regression setting of Tables 2 and 3 when h is divergent and bounded, respectively. For all n, the performance is similar to that in Tables 8 and 9. Except when (m, r) = (5, 80) and λ̄ = 0.1, E and T are more powerful than C.

FINAL COMMENTS
We have derived refined LM tests of lack of correlation against SAR error correlation in regression models, using Edgeworth expansion, examined their local power, and compared their finite-sample performance with that of other tests. The tests are based on asymptotic theory, but they do seem to improve on standard, uncorrected, tests in modest sample sizes. They are relatively simple to compute, partly because of imposing normality. Edgeworth expansions without distributional assumptions can be derived, in terms of higher-order cumulants (e.g., Knight, 1985), but estimates of the latter tend to be imprecise except in very large samples. As Ogasawara (2006a,b) found in other settings, our normal-based tests will remain valid under only slight relaxation of normality, with certain equality restrictions holding (e.g., zero fourth cumulants). Bootstrap-based tests will be valid much more generally, and rival our higher-order improvements, but bootstrap statistics do vary with implementation. We believe that empirical researchers are still likely to report the standard LM statistic and compare it with χ² critical values, in which case it costs little more to carry out our tests, which do not require estimation of any further nuisance parameters. In this paper, we make other restrictive assumptions. The requirement of deterministic regressors is quite standard in the SAR literature, but our results should hold after conditioning on stochastic regressors that are independent of the errors. Relaxing exogeneity then becomes an issue, but Edgeworth expansions allowing endogeneity would be considerably more complicated. Allowing endogeneity of the weight matrix is also an important issue, but so far as we know, serious progress on allowing this, in the context of first-order theory, has begun only recently; see Qu and Lee (2013). Other assumptions will be more straightforward to relax, such as linearity of the regression and homoscedasticity of the innovations ε_i.

REFERENCES
Pitman, E. G. J. (1937). Significance tests which may be applied to samples from any populations. Supplement to the Journal of the Royal Statistical Society 4, 119-30.
Qu, X. and L.-F. Lee (2013). Estimating a spatial autoregressive model with an endogenous spatial weight matrix. Working paper, Ohio State University.
Rao, C. R. and R. Mukerjee (1995). Comparison of Bartlett-type adjustments for the efficient score statistic.