Volume 37, Issue 8
Research

Testing environmental and genetic effects in the presence of spatial autocorrelation

François Rousset

E-mail address: francois.rousset@univ-montp2.fr

Inst. des Sciences de l'Evolution (UM2‐CNRS), Univ. Montpellier 2, Place Eugène Bataillon, CC 065, FR‐34095 Montpellier cedex 5, France

Inst. de Biologie Computationnelle, Montpellier, France

Jean‐Baptiste Ferdy

Laboratoire Évolution et Diversité Biologique, UMR 5174 CNRS – Univ. Paul Sabatier – ENFA, route de Narbonne, FR‐31062 Toulouse Cedex 9, France

First published: 18 February 2014

Abstract

Spatial autocorrelation is a well‐recognized concern for observational data in general, and more specifically for spatial data in ecology. Generalized linear mixed models (GLMMs) with spatially autocorrelated random effects are a potential general framework for handling these spatial correlations. However, as the result of statistical and practical issues, such GLMMs have been fitted through the undocumented use of procedures based on penalized quasi‐likelihood approximations (PQL), and under restrictive models of spatial correlation. Alternatively, they are often neglected in favor of simpler but more questionable approaches. In this work we aim to provide practical and validated means of inference under spatial GLMMs that overcome these limitations. For this purpose, new software is developed to fit spatial GLMMs. We use it to assess the performance of likelihood ratio tests for fixed effects under spatial autocorrelation, based on Laplace or PQL approximations of the likelihood. As expected, the Laplace approximation generally performs slightly better, although a variant of PQL was better in the binary case. We show that a previous implementation of PQL methods in the R language, glmmPQL, is not appropriate for such applications. Finally, we illustrate the efficiency of a bootstrap procedure for correcting the small sample bias of the tests, which applies also to non‐spatial models.

Spatial autocorrelation is a well‐known concern in the modelling of the distribution of species or species richness (Keitt et al. 2002, Dormann et al. 2007, Bini et al. 2009), community structure (Robertson and Freckman 1995), and distribution of phenotypic and genetic variation (Stopher et al. 2012, Bradburd et al. 2013). It arises each time the value a response variable takes at one point in space correlates with its values in nearby localities. Spatial autocorrelation may represent the effect of unobserved predictor variables that themselves exhibit spatial autocorrelation. Alternatively, the response may have identical expectation everywhere, but may fluctuate randomly and in a correlated manner in nearby positions when its value in any place depends on the realized values in nearby positions at some earlier time. In both cases, the standard hypothesis of independence in errors is violated and simple statistical tools are inappropriate. It then becomes difficult to infer and test properly the effect of the predictor variable on the response variable. This problem is well recognized in population biology, and many approaches have been described to address it (see Dormann et al. 2007 for a survey), but much fewer have been validated.

One way to model spatial autocorrelation in the response variable is to consider that it results from random effects that are spatially correlated. Generalized linear mixed models (GLMMs) with spatially autocorrelated random effects are therefore a potential general framework for handling these spatial correlations. However, as summarized by Bolker et al. (2009), complex GLMMs remain challenging to fit and statistical inference such as hypothesis testing remains difficult. Available software allowing autocorrelated random effects has various limitations, in terms of range of models allowed, computation limits, dependence on user decisions, and criteria of fit. One of the common practices is to use variants of penalized quasi‐likelihood approximations (PQL, Breslow and Clayton 1993; summarized later in this paper), which have been implemented in the GLIMMIX procedure in SAS or in the glmmPQL procedure in R. The use of the latter procedure for spatial analyses rests on a largely undocumented trick (Dormann et al. 2007). Other algorithms are discussed in the literature, such as Markov chain Monte Carlo methods (Diggle and Ribeiro 2007), but it is difficult to fully automate their application and, perhaps as a result, their performance has not been systematically investigated.

In the ecological and evolutionary literature, a commonly used alternative approach for testing the effect of a variable in the presence of spatial autocorrelation is the partial Mantel test. Several variants of this methodology have been described, but it typically first considers a regression of a distance matrix of the response variable on a geographic distance matrix, then uses the residuals of this first regression in a second regression on some function of the environmental variable. Oden and Sokal's (1992) simulation study first pointed out problems with such approaches. Despite additional criticisms (Raufaste and Rousset 2001, Rousset 2002), the approach keeps being used, and defended (Legendre and Fortin 2010, Appendix 3). The simulation study of Guillot and Rousset (2013) shows that all variants discussed by Legendre and Fortin (2010) fail, and can produce a high rate of spurious significant results. Indeed all of these methods are subject to an earlier criticism which rests only on the distribution of samples generated by permutation (Raufaste and Rousset 2001), rather than on the nature of different test statistics as discussed in Legendre and Fortin (2010). What this debate shows is that partial Mantel tests will keep being used despite their weaknesses, as long as no easy and broadly applicable alternative is available. Providing an alternative to these tests can be viewed as part of the broader problem of estimating and constructing valid and efficient (likelihood‐based) confidence intervals for fixed effects in a GLMM.

In this work, we have developed new tools to address these issues. A package, spaMM, has been developed to fit spatial GLMMs in R (R Core Team). This is a standard R package, i.e. free software running on all major operating systems, including documentation with examples based on included data sets. This package uses classical Laplace approximations for the likelihood, and the basic model for spatial correlation is the Matérn model, which encompasses the widely used but more restrictive exponential and Gaussian correlation models.

In a GLMM, confidence intervals for parameters as well as tests of given values can be deduced from likelihood ratios. The validity of both types of inferences is assessed by checking the distribution of the likelihood ratio p‐values. We therefore use our new procedures to assess the performance of likelihood ratio tests and of their PQL counterparts, for both linear mixed models and for binomial and Poisson GLMMs which are relevant for count data in ecological studies. In particular, we reconsider the problem of testing fixed effects in simulation conditions where small sample bias could be expected, as well as conditions closely matching two actual studies. Although likelihood ratio (LR) tests tend to be anticonservative, we found generally good performance in the simple inferences we considered. To correct for small‐sample bias of likelihood ratio tests, we will apply a parametric bootstrap approach that requires only a small number of bootstrap replicates (as few as 100). Together, these different methods provide reliable inferences in the presence of spatial autocorrelation. For binary data, we unexpectedly found that a PQL‐based procedure could perform better than other approximations to the likelihood.

Methods

Estimation and inference

Spatial GLMMs

We consider GLMMs with spatially correlated random effects. For example, we consider observed frequencies of one genotype in different spatial locations i. Such data can be fitted by a Binomial GLMM with canonical logit link, wherein for each location i the data are fitted by a Binomial(n_i, p_i) where n_i is the sample size in location i and
\[ \operatorname{logit}(p_i) = \ln\frac{p_i}{1-p_i} = \mathbf{x}_i^{\top}\boldsymbol{\beta} + b_i \tag{1} \]
where x_i are observed values of predictor variables in spatial positions i, β is a vector of associated fixed effect parameters, and the b_i are random effects in different spatial positions i. Likewise, in a Poisson GLMM with canonical log link, the data are counts whose expectation c_i is of the form
\[ \ln(c_i) = \mathbf{x}_i^{\top}\boldsymbol{\beta} + b_i \tag{2} \]
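
As a minimal sketch of how Eq. 1 and 2 map a linear predictor to an expected response, the following Python snippet evaluates both inverse links for one location. The numerical values of β, x_i and b_i are hypothetical and chosen only for illustration; the function names are not part of any package.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link (Eq. 1): linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_log(eta):
    """Inverse of the log link (Eq. 2): linear predictor -> expected count."""
    return math.exp(eta)

# Hypothetical values, for illustration only
beta = [-1.0, 0.5]   # fixed effects: intercept and slope
x_i = [1.0, 0.3]     # predictor row for location i (1.0 codes the intercept)
b_i = 0.2            # realized random effect in location i

# Linear predictor x_i' beta + b_i, shared by both models
eta_i = sum(xk * bk for xk, bk in zip(x_i, beta)) + b_i

p_i = inv_logit(eta_i)  # binomial GLMM: expected frequency in location i
c_i = inv_log(eta_i)    # Poisson GLMM: expected count in location i
print(eta_i, p_i, c_i)
```

The same linear predictor is reused for both links only to keep the example short; in practice each model has its own parameter values.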

Standard accounts of GLMMs often also include GLMMs with Gamma‐distributed residual error, and other link functions such as the complementary log–log link for binomial data. Such cases are included in our procedures, but will not be discussed in this paper.

In the general formulation of GLMMs, the b_i are assumed Gaussian with zero mean and any covariance matrix among them can be considered. The vector b of b_i values is usually represented as b = Zv, where v is a vector of independent Gaussian deviates, and Z is a matrix which is either known or a function of some parameters to be estimated. This representation holds for spatial models (Breslow and Clayton 1993, Lee and Nelder 2001b) because any multivariate Gaussian distribution with marginal variance λ can be represented as the distribution of Zv for a vector v of independently distributed Gaussian deviates with zero mean and variance λ. Its covariance matrix is then λZZ⊤ (where ⊤ denotes transpose), which implies that Z can be obtained as the Cholesky factor of the correlation matrix, or as the matrix square root for symmetric positive semi‐definite matrices (Golub and van Loan 1996, pp. 143, 149).
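
The representation b = Zv can be sketched in a few lines of Python. The 3 × 3 correlation matrix and the value of λ below are hypothetical; the code uses a textbook Cholesky factorization and is only an illustration of the idea, not the implementation used in spaMM.

```python
import math
import random

def cholesky(a):
    """Lower-triangular factor low such that low * low^T = a
    (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def correlated_effects(corr, lam, rng):
    """Draw b = Z v, with Z the Cholesky factor of corr and v i.i.d. N(0, lam)."""
    z = cholesky(corr)
    n = len(corr)
    v = [rng.gauss(0.0, math.sqrt(lam)) for _ in range(n)]
    b = [sum(z[i][k] * v[k] for k in range(n)) for i in range(n)]
    return z, b

# Hypothetical correlation matrix among three locations
corr = [[1.0, 0.6, 0.3],
        [0.6, 1.0, 0.6],
        [0.3, 0.6, 1.0]]
z, b = correlated_effects(corr, lam=0.1, rng=random.Random(1))

# Z Z^T recovers the correlation matrix, so cov(b) = lam * corr
zzt = [[sum(z[i][k] * z[j][k] for k in range(3)) for j in range(3)]
       for i in range(3)]
print(b)
```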

In an elementary linear mixed model, there are two dispersion parameters, the variance λ of the b_i, and the variance φ of the residual error, and the Z matrix is known and described as the design matrix of the random effects. In spatial models, we distinguish three types of parameters, the previous fixed effect and dispersion parameters, and the correlation parameters controlling the correlations between the b_i in different locations. The correlation parameters affect the value of the Z matrix, which is no longer assumed constant in the process of fitting the model to the data, but is still commonly described as the design matrix of random effects. The x_i may also be realizations of a spatially correlated process; however, all inferences are conditional on the realized values of the design matrix X, that is the set of x_i for all positions (Davison 2003, p. 648; Cox 2006, p. 46), and therefore the GLMM analysis makes no assumption whether the elements of X are conceived as correlated random variables or not.

Approximation of the likelihood

In mixed models, the likelihood is actually the marginal likelihood, integrated over the distribution of random effects. This is often difficult to evaluate, and various approximations have been developed (Breslow and Clayton 1993, Demidenko 2004, Lee et al. 2006). Several of them can be formulated in terms of the h‐likelihood (Lee and Nelder, 1996, 2001a). The h‐likelihood is the sum of the log likelihood of the data as function of the linear predictor (i.e. of Xβ+ Zv in the above examples), and of the log likelihood of random effects values v= (υi) under the assumed distribution of random effects:
\[ h(\beta, v; \lambda, \phi) = \ell(y \mid X\beta + Zv; \phi) + \ell(v; \lambda) \tag{3} \]
where ℓ denotes log likelihood, which may be computed either as log probability or as log probability density. The marginal likelihood for parameters (β,λ,φ) is the integral of exp(h) over the distribution of random effects, and this is approximated by a Laplace approximation as
\[ p_v(h) = \left[ h - \frac{1}{2}\ln\left|\frac{H(h,v)}{2\pi}\right| \right]_{v=\hat{v}} \tag{4} \]

where the inferred random effects v̂ are obtained by maximizing the h‐likelihood with respect to v, and H(h,v) is the Hessian matrix of the h‐likelihood with respect to the random effects, i.e. the matrix with ijth element −∂²h/∂υi∂υj; and |.| denotes the absolute value of the matrix determinant (Demidenko 2004, p. 12). H(h,v) can be expressed in terms of the design matrix Z, of the random effect variance, and of GLM ‘weights’ that depend on the value of the linear predictor (as function of β and v) and the link function (McCullagh and Nelder 1989, p. 40). As pν(h) is an approximation for the marginal log‐likelihood, likelihood ratio tests of fixed effects can be constructed from it.
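
To make the Laplace approximation concrete, here is a small Python sketch for a deliberately simplified case that is not one of the models fitted in this paper: a single Poisson count y with log link and a single Gaussian random effect v with variance λ. All numerical values are hypothetical. The sketch computes the approximation (h at its maximum over v, minus half the log of the curvature divided by 2π) and compares it with the marginal log‐likelihood obtained by brute‐force numerical integration.

```python
import math

def h_lik(v, y, alpha, lam):
    """h-likelihood: Poisson log-likelihood given v, plus Gaussian log-density of v."""
    eta = alpha + v
    log_pois = y * eta - math.exp(eta) - math.lgamma(y + 1)
    log_gauss = -0.5 * math.log(2 * math.pi * lam) - v * v / (2 * lam)
    return log_pois + log_gauss

def laplace_loglik(y, alpha, lam, n_iter=50):
    """Laplace approximation of the marginal log-likelihood (scalar random effect)."""
    v = 0.0
    for _ in range(n_iter):                # Newton steps maximizing h over v
        mu = math.exp(alpha + v)
        grad = y - mu - v / lam            # dh/dv
        neg_hess = mu + 1.0 / lam          # -d2h/dv2, always positive here
        v += grad / neg_hess
    neg_hess = math.exp(alpha + v) + 1.0 / lam
    return h_lik(v, y, alpha, lam) - 0.5 * math.log(neg_hess / (2 * math.pi)), v

def numeric_loglik(y, alpha, lam, half_width=8.0, step=0.001):
    """Log of the integral of exp(h) over v, by the trapezoid rule."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        v = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(h_lik(v, y, alpha, lam))
    return math.log(total * step)

# Hypothetical data and parameters
lap, v_hat = laplace_loglik(y=3, alpha=1.0, lam=0.5)
num = numeric_loglik(y=3, alpha=1.0, lam=0.5)
print(lap, num, v_hat)
```

In this easy case the two values agree closely; the binary‐data results discussed later in the paper show that the agreement can be much worse.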

PQL (Breslow and Clayton 1993), which estimates fixed effects by maximization of h rather than pν(h) (Lee and Nelder 2001a, Demidenko 2004, McCulloch et al. 2008), is usually a less accurate approximation of the likelihood, up to the point where it has been considered ‘not truly an approximation to the likelihood function’ and not allowing the use of likelihood ratio tests (Pinheiro and Chao 2006). Even the above Laplace approximation may fail for binary data, for which a second‐order correction has been proposed (Noh and Lee 2007).

Inference

Efficient methods to compute likelihoods and to fit models to data may not be enough. Indeed, testing a fixed effect in a GLMM is a source of persistent concerns for practitioners. For linear mixed models, both likelihood ratio tests and approximate F and t tests based on effective degrees of freedom have been criticized (Pinheiro and Bates 2000, section 2.4.2, Baayen et al. 2008, Bolker et al. 2009, p. 132). Baayen et al. (2008) suggest an MCMC approach, but it is not fully developed and the few simulation results available suggest it is conservative (their Tables 4 and 5). A reasonably fast and more widely applicable method is required.

An LR chi‐square statistic with n degrees of freedom should have expected value n, but for finite samples, its expected value m will differ (as already occurs in linear models without random effects). However, the LR test can often be corrected in a conceptually very simple way, by multiplying the LR statistic by n/m: an accurate correction of the distribution of the LR statistic can thus be derived from consideration of its mean only (Bartlett 1937). In practice m may be very difficult to approximate analytically, but it can be estimated by a bootstrap approach, an approach that is investigated below. This provides an effective correction of the LR test that is faster than a bootstrap assessment of the distribution of p‐values. Although this method is not new (Rocke 1989, Rayner 1990), it seems to have been overlooked in practice. For non‐spatial models, Pinheiro and Bates (2000, p. 88) used a simple design to illustrate the biases of LR tests in linear mixed models, and Fig. A1 in the Supplementary material Appendix A demonstrates the effectiveness of the bootstrap correction in this case.

In the following, we will compare two variants of the above methods, denoted ML and PQL/L. In ML, all parameters were estimated by maximization of pν. In PQL/L, considered for Poisson and binomial GLMMs (including the binary case), β is estimated by maximization of h as in standard PQL, and all dispersion and correlation parameters are estimated by generic numerical maximization of pν. Reasons for these choices and further alternatives are discussed in the Supplementary material Appendix B.

Implementation details

To allow the estimation of correlation parameters and the investigation of variants of the estimation method, we developed the new package spaMM. It is available from the Comprehensive R Archive Network (CRAN). It is based on the iteratively reweighted least squares algorithm (Demidenko 2004, Lee et al. 2006, McCulloch et al. 2008 for background) for estimation of β, with the gradient and Hessian matrix computed as described in Noh and Lee (2007) and Lee and Lee (2012). We also implemented a Levenberg–Marquardt variant (Nocedal and Wright 1999, Madsen et al. 2004) of this algorithm. Dispersion parameters were estimated using leverages corrected as in Lee and Nelder (2001a). Computation of the corrected AIC of Ha et al. (2007) is also included. Beyond the simulations reported below, the code was checked by comparison with other R packages: the lme4 package for non‐spatial linear mixed models (Bates et al. 2012), the HGLMMM package (Molas and Lesaffre 2011) for a wide class of non‐spatial mixed models, and the hglm package (Rönnegård et al. 2010), which can fit models with a given correlation matrix, based on the extended quasi‐likelihood method (Lee and Nelder 1996).

Spatial correlation model

We assume that the correlation between random effects υi at spatial distance d is of the form Mν(ρd), where ρ is a spatial scale parameter and Mν(ρd) is the Matérn correlation family, which can be written as:
\[ M_\nu(\rho d) = \frac{(\rho d)^{\nu}\, K_\nu(\rho d)}{2^{\nu-1}\,\Gamma(\nu)} \tag{5} \]

where Kν is the modified Bessel function of the second kind and order ν, and ν > 0 is the ‘smoothness’ parameter (the higher ν is, the smoother are the realized surfaces at a small scale). The Matérn family is appropriate to fit autocorrelated processes with more or less rugged realizations, and is the most useful correlation model for a wide range of applications (Stein 1999, see also Minasny and McBratney 2005, Hoeting et al. 2006, Diggle and Ribeiro 2007). It includes the commonly used exponential and squared exponential (or ‘Gaussian’) correlation functions as special cases (for ν= 0.5 and ν→∞, respectively). Both ρ and ν were estimated.
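
The Matérn correlation can be evaluated with a short, self‐contained Python sketch. It assumes the standard form (ρd)^ν Kν(ρd) / (2^(ν−1) Γ(ν)) and computes the modified Bessel function Kν from its integral representation with a simple trapezoid rule; the function names and the step size are illustrative choices, and a numerical library routine would normally be preferred.

```python
import math

def bessel_k(nu, x, step=0.01):
    """Modified Bessel function of the second kind K_nu(x), x > 0, from
    K_nu(x) = integral over t in [0, inf) of exp(-x cosh t) cosh(nu t),
    evaluated with the trapezoid rule (integrand decays very fast)."""
    total = 0.5 * math.exp(-x)          # t = 0 term with half weight
    t = 0.0
    while True:
        t += step
        term = math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
        total += term
        if t > 1.0 and term <= 1e-18 * total:
            break                       # remaining tail is negligible
    return total * step

def matern(d, rho, nu):
    """Matern correlation at distance d, scale rho, smoothness nu."""
    x = rho * d
    if x <= 0.0:
        return 1.0                      # correlation at zero distance
    return x ** nu * bessel_k(nu, x) / (2.0 ** (nu - 1.0) * math.gamma(nu))

# For nu = 0.5 the Matern correlation reduces to exp(-rho * d)
print(matern(0.1, 10.0, 0.5), math.exp(-1.0))
# A larger smoothness gives a higher correlation at the same scaled distance
print(matern(0.1, 10.0, 4.0))
```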

Autoregressive models, either ‘conditional’ (CAR) or ‘simultaneous’ (SAR) have been even more widely considered (Dormann et al. 2007; and the WinBUGS software, Lunn et al. 2000) because they lead to simpler algebra, and in particular facilitate the application of fast sparse matrix methods. But autoregressive models have notable drawbacks as models of spatial autocorrelation (Wall 2004, Martellosio 2012), and will not be discussed here, although a CAR has been implemented in the spaMM package. We also implemented the Matérn correlation function in a form suitable for use with alternative procedures in R such as nlm and glmmPQL, but encountered several problems with the last one, as will be shown.

Simulation study

By definition, the distribution of p‐values under a null hypothesis should be uniform. Simulations were performed to check this property in small samples, for idealized or more realistic scenarios of the effect of a variable in a spatial landscape.

In both data simulation and analysis, the Matérn correlation model is used. We assume linear predictors of the form ηi = α + βxi + υi, including the effect of an ‘environmental’ variable xi, and the random effect υi drawn from the multivariate Gaussian distribution with variance λ and correlations Mν(ρd). The data are simulated for the same model, but with β= 0. In the Gaussian linear mixed model, the response variable is ηi+ei, where the residual error ei is Gaussian with variance φ which is also estimated. In the binomial GLMM with logit link (Eq. 1), the residual error is that of Binomial sampling. In the Poisson GLMM with log link (Eq. 2), the residual error is that of Poisson sampling.

For each set of parameters described below, 1000 samples were analyzed, each being independent in terms of the realized random effects and of the spatial location of samples. All results are without bootstrap correction, unless mentioned otherwise.

Default simulation design

For simplicity, an identical default set of spatial parameter values was considered for binomial, Poisson and Gaussian models: the smoothness parameter ν was either 0.5 (exponential correlation) or 4 (closer to Gaussian correlation). ρ was set to 10, and for each dataset, ns= 40 locations (indexed as i= 1,…,40) were sampled at random pairs of geographical coordinates each drawn from independent Gaussian distributions with standard deviation (‘spatial spread’σsp) 0.2 or 0.6. It is equivalent to vary ρ or to vary this standard deviation, so only the latter was varied. Then an environmental value xi is assigned to each location as a simple sequence, xi= (1,…,ns)/ns.
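
A minimal Python sketch of this default design is given below for the exponential case (ν = 0.5), where the correlation at distance d is simply exp(−ρd). The function names and the random seed are hypothetical, and the nearest‐neighbour summary is included only as an example of a quantity one may wish to inspect; for ν = 4 the general Matérn formula would be needed instead.

```python
import math
import random

def simulate_design(ns=40, spread=0.2, rho=10.0, seed=0):
    """Random sampling locations, the environmental variable x_i = i/ns,
    and the exponential (Matern, nu = 0.5) correlation matrix."""
    rng = random.Random(seed)
    # Each coordinate drawn from an independent Gaussian with sd 'spread'
    coords = [(rng.gauss(0.0, spread), rng.gauss(0.0, spread)) for _ in range(ns)]
    x = [(i + 1) / ns for i in range(ns)]
    corr = [[math.exp(-rho * math.dist(p, q)) for q in coords] for p in coords]
    return coords, x, corr

def mean_nearest_neighbour_corr(corr):
    """Average correlation between each location and its closest neighbour."""
    n = len(corr)
    return sum(max(corr[i][j] for j in range(n) if j != i) for i in range(n)) / n

coords, x, corr = simulate_design(ns=40, spread=0.2, rho=10.0, seed=0)
nn_corr = mean_nearest_neighbour_corr(corr)
print(nn_corr)
```

Because only the product ρd enters the correlation, multiplying `spread` by a constant has the same effect as multiplying `rho` by that constant, which is why only one of them needs to be varied.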

In the Gaussian case, the α value does not affect performance, and the residual variance was set to φ= 0.1. The variance of random effects was either λ= 0.1 or 2.5, representing excess relative variance (‘over‐dispersion’) values of 1 or 25 relative to the residual error. In the Poisson model the parameters were chosen so as to achieve a similar over‐dispersion. The marginal distribution of the response is Poisson‐lognormal, with mean exp(μf+λ/2) given the fixed term μf of the linear predictor, and second factorial moment exp[2(μf+λ)]. From this, the over‐dispersion relative to the Poisson variance μf is ≈ 1 for μf= 15 and λ= 0.06, and ≈ 25 for μf= 10 and λ= 0.763, in both cases with marginal mean ≈ 15. In the binomial case, there is no simple expression for the moments of the marginal (binomial logit‐normal) distribution. For small λ and large binomial sample size N, a Taylor series approximation (Coull and Agresti 2000) suggests that the over‐dispersion is close to that for the Gaussian model with the same λ, so λ= 0.1 or 2.5 was considered again and the resulting over‐dispersion was estimated from the simulations. Given the other assumed values μf=−1 (i.e. expected frequency pf= 1/4) and binomial sample size N= 40 in each location, the observed over‐dispersion relative to the binomial variance Npf(1 −pf) was ≈ 1 and 15, respectively.
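
The Poisson over‐dispersion values can be checked with a few lines of Python. The sketch rests on two assumptions about the notation that are ours: the quoted values 15 and 10 are read as the exponential of the fixed term of the linear predictor (the Poisson mean without random effect), and over‐dispersion is read as the variance in excess of the marginal mean, divided by that Poisson mean. Under this reading the moment formulas of the text give values close to 1 and 25.

```python
import math

def poisson_lognormal_summary(mean_fixed, lam):
    """Marginal mean and over-dispersion of a Poisson-lognormal response.
    mean_fixed = exp(mu_f) is an assumed reading of the quoted values."""
    mu_f = math.log(mean_fixed)
    marginal_mean = math.exp(mu_f + lam / 2)          # mean of the response
    second_factorial = math.exp(2 * (mu_f + lam))     # E[Y(Y - 1)]
    variance = second_factorial + marginal_mean - marginal_mean ** 2
    excess = variance - marginal_mean                 # variance beyond Poisson
    return marginal_mean, excess / mean_fixed

mean_low, od_low = poisson_lognormal_summary(15.0, 0.06)
mean_high, od_high = poisson_lognormal_summary(10.0, 0.763)
print(mean_low, od_low)     # marginal mean near 15, over-dispersion near 1
print(mean_high, od_high)   # marginal mean near 15, over-dispersion near 25
```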

Additional simulations were also run for binary data, i.e. binomial data with one individual sampled per site, which describes presence/absence data. For binary data, the number of sites was increased to 100 (still a very small total sample size). Other parameters were set as above. Binary samples were checked for ‘separation’ (finiteness of the fixed effect ML estimates) using the algorithms implemented in the safeBinaryRegression package (Konis 2009). Further, samples with fewer than 10 observations of either type were ignored as non‐informative.

Real‐life sampling designs

In addition, simulations were run in two settings matching those of two real‐life applications, as described below.

Mueller et al. (2011) have searched for polymorphisms associated with migration behaviour in the European blackcap Sylvia atricapilla. Their best candidate is the allele size polymorphism at the ADCYAP1 locus encoding a neuropeptide, the adenylate cyclase‐activating polypeptide 1. To take into account correlations generated by gene flow, they used partial Mantel tests between per‐population mean allele size and a score for migratory behaviour, with 14 populations. Here these data have been reanalyzed as a linear mixed model, and a simulation study was performed, matching the spatial positions of the original samples, the values of the explanatory variable (mean allele size), and dispersion and correlation parameters close to estimates from the data (ν= 0.63, ρ= 0.055, λ= 0.55, φ= 0.0003). It should be clear that 14 autocorrelated data points provide very limited information for estimating 5 parameters under the null model (one fixed effect parameter in addition to ν, ρ, λ, and φ), in which case small sample biases are expected. This case was therefore used to illustrate the efficiency of the bootstrap procedure.

The epidemiological study of Diggle et al. (2007) provides another realistic design, already used by Guillot and Rousset (2013) to illustrate the performance of partial Mantel tests. In this case, a binomial GLMM was applied to determine environmental features (altitude and vegetation features) affecting the prevalence of infection by the filarial nematode Loa loa involved in onchocerciasis in villages in Cameroon. Here the samples have been drawn assuming the reported estimates as in the original study: ρ= 1/0.7 and ν= 0.5, a superset of 197 locations, and the corresponding altitude values, which are here taken as the explanatory variable whose effect is tested.

Realized spatial correlations in the simulations

The simulations should cover a wide range of realistic levels of spatial correlation. The migration gene example represents a case of strong correlation over the landscape (r= 0.75 on average between a sampled position and its closest neighbour, average r= 0.36 over the landscape). The Loa loa prevalence example represents a case of stronger correlation among such neighbours and more moderate autocorrelation overall (r= 0.90 and r= 0.13, respectively). Such autocorrelation is large enough to substantially impact the performance of partial Mantel tests of fixed effects (Guillot and Rousset 2013). Our simulation study covers a wider set of autocorrelation situations, as shown in Fig. 1.

Figure 1. Cumulative distributions of spatial correlations of random effects for the different simulation conditions.

Bootstrap estimation of LR bias

For the bootstrap estimation of the LR bias, for each sample analyzed, 100 new samples are simulated under the null hypothesis, with estimated parameters under the null model, and with the same spatial locations as the original sample. The mean likelihood ratio for these new samples is then computed and used to correct the original likelihood ratio, independently for the 1000 original samples analyzed.
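
The logic of this bootstrap correction can be sketched in Python on a deliberately simplified stand‐in: a Gaussian linear model with a single slope and no random effects, for which the LR statistic has a closed form. This is not the spatial mixed model analyzed in the paper; function names, data and seeds are hypothetical. The steps are the same, however: fit the null model, simulate new samples under the fitted null with the same predictor values, average their LR statistics to estimate m, and multiply the observed LR by n/m with n = 1 degree of freedom.

```python
import math
import random

def lr_slope(x, y):
    """LR statistic for testing a zero slope in a Gaussian linear model."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = sum((b - my) ** 2 for b in y)   # residual sum of squares, null model
    rss1 = rss0 - sxy ** 2 / sxx           # residual sum of squares, full model
    return n * math.log(rss0 / rss1)

def bootstrap_corrected_lr(x, y, n_boot=100, seed=0):
    """Bartlett-type correction of the LR, with its mean m estimated by bootstrap."""
    rng = random.Random(seed)
    n = len(y)
    mean0 = sum(y) / n                                       # null-model ML fit
    sd0 = math.sqrt(sum((b - mean0) ** 2 for b in y) / n)
    lr_obs = lr_slope(x, y)
    # New samples simulated under the fitted null, same predictor values
    boot = [lr_slope(x, [rng.gauss(mean0, sd0) for _ in range(n)])
            for _ in range(n_boot)]
    m = sum(boot) / n_boot                  # bootstrap estimate of E[LR]
    df = 1
    lr_corr = lr_obs * df / m
    p_value = math.erfc(math.sqrt(lr_corr / 2))   # chi-square tail, 1 df
    return lr_obs, m, lr_corr, p_value

# Hypothetical small sample (14 points) simulated under the null
data_rng = random.Random(42)
x = [(i + 1) / 14 for i in range(14)]
y = [data_rng.gauss(0.0, 1.0) for _ in x]
lr_obs, m, lr_corr, p_value = bootstrap_corrected_lr(x, y, n_boot=100)
print(lr_obs, m, lr_corr, p_value)
```

With so few data points the estimated m is typically above 1, so the corrected statistic is smaller than the raw one, which counteracts the anticonservative tendency of the uncorrected test.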

Results

Default simulation design

The Supplementary material Appendices C and D show the distribution of p‐values for all simulations (Fig. C1 and D1). As convenient summaries, we report here the proportion of significant tests at the conventional 0.05 and 0.01 levels, and the average value of the likelihood ratio chi‐square statistic, whose expectation should equal the number of degrees of freedom, i.e. 1 in all cases. If the testing procedure is exact, for 1000 simulation replicates the observed values of these summaries are expected to fall with probability ≈ 0.95 in the intervals 0.037–0.063, 0.005–0.016, and 0.914–1.09, respectively.
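
Intervals of this kind can be approximated with simple formulas, as in the Python sketch below. It uses a normal approximation for the proportion of significant tests and the Wilson–Hilferty approximation for the mean of independent chi‐square statistics; these are our choices for illustration and need not be the calculation behind the intervals quoted above, so small differences in the third decimal (notably for the discrete 0.01 case) are to be expected.

```python
import math

def proportion_interval(level, n_rep, z=1.96):
    """Approximate 95% range for the observed proportion of tests significant
    at 'level' when the test is exact (normal approximation to the binomial)."""
    half = z * math.sqrt(level * (1 - level) / n_rep)
    return level - half, level + half

def mean_chisq_interval(df, n_rep, z=1.96):
    """Approximate 95% range for the mean of n_rep independent chi-square(df)
    statistics, using the Wilson-Hilferty cube-root approximation applied to
    their sum, which is chi-square with df * n_rep degrees of freedom."""
    k = df * n_rep
    c = 2.0 / (9.0 * k)
    lo = df * (1 - c - z * math.sqrt(c)) ** 3
    hi = df * (1 - c + z * math.sqrt(c)) ** 3
    return lo, hi

p05 = proportion_interval(0.05, 1000)
p01 = proportion_interval(0.01, 1000)
lr_mean = mean_chisq_interval(1, 1000)
print(p05, p01, lr_mean)
```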

Overall it was found that p‐values of LR tests derived from ML fits were close to uniformly distributed (Supplementary material). For the default set of parameters, the main deviations for low p‐values were observed in the Gaussian case (Table 1). In this case, LR tests are anti‐conservative, a known result even for fixed‐effect linear models, and which comes from the approximate nature of likelihood ratio tests in small samples. The same trend occurs to a much lesser extent in the non‐Gaussian cases. This is perhaps best summarized by the mean value of the LR chi‐square statistic, which is 1.21 for Gaussian cases, but only 1.06 and 1.03 for Poisson and binomial cases, respectively.

Table 1. Performance of likelihood ratio tests. In each case, the table shows the proportion of significant tests at the conventional 0.05 and 0.01 levels, and the average value LR of the likelihood ratio chi‐square statistic. For λ, ‘low’ and ‘high’ values are respectively 0.1 and 2.5, except for the Poisson case where they are 0.06 and 0.763 (see main text). For binary samples, the λ estimates were constrained below 5 (see main text). Column groups, from left to right: Gaussian, Poisson, Binomial, Binary (λ̂ < 5).

| σsp | ν | λ | Gaussian < 0.05 | Gaussian < 0.01 | Gaussian LR | Poisson < 0.05 | Poisson < 0.01 | Poisson LR | Binomial < 0.05 | Binomial < 0.01 | Binomial LR | Binary < 0.05 | Binary < 0.01 | Binary LR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 0.5 | low | 0.072 | 0.019 | 1.161 | 0.051 | 0.007 | 1.05 | 0.054 | 0.008 | 1.062 | 0.052 | 0.011 | 0.995 |
| 0.2 | 0.5 | high | 0.074 | 0.022 | 1.241 | 0.065 | 0.018 | 1.172 | 0.072 | 0.011 | 1.228 | 0.052 | 0.013 | 1.066 |
| 0.2 | 4 | low | 0.078 | 0.017 | 1.195 | 0.051 | 0.013 | 1.044 | 0.052 | 0.011 | 0.973 | 0.058 | 0.016 | 1.025 |
| 0.2 | 4 | high | 0.059 | 0.013 | 1.138 | 0.048 | 0.01 | 0.984 | 0.041 | 0.008 | 0.952 | 0.055 | 0.013 | 1.009 |
| 0.6 | 0.5 | low | 0.073 | 0.018 | 1.182 | 0.056 | 0.011 | 1.079 | 0.049 | 0.015 | 1.014 | 0.06 | 0.016 | 1.045 |
| 0.6 | 0.5 | high | 0.069 | 0.018 | 1.15 | 0.074 | 0.015 | 1.179 | 0.053 | 0.009 | 1.098 | 0.044 | 0.011 | 0.955 |
| 0.6 | 4 | low | 0.077 | 0.019 | 1.22 | 0.039 | 0.01 | 0.949 | 0.047 | 0.006 | 0.996 | 0.057 | 0.017 | 1.037 |
| 0.6 | 4 | high | 0.091 | 0.026 | 1.377 | 0.048 | 0.014 | 1.042 | 0.043 | 0.006 | 0.948 | 0.051 | 0.01 | 1.016 |

Binary data

The analysis of binary data was less straightforward. First, binary data are generally considered the most challenging setting for approximations of likelihood. In the present application, there is only one draw for each level of the random effects, in contrast to other discussions of binary data (Breslow and Lin 1995, Pinheiro and Chao 2006, Noh and Lee 2007) where there are at least two such draws (binary matched pairs). The use of PQL when the number of draws is low has been particularly criticized (McCulloch et al. 2008), although its performance is expected to improve quickly with the number of draws. For binary matched pairs, it has also been found that the pν(h) approximation of the likelihood could be improved by a second‐order Laplace approximation (Noh and Lee 2007). We considered all three methods for the estimation of fixed effects (standard Laplace approximation for ML, second‐order Laplace approximation, and PQL) and unexpectedly found that the PQL variant, PQL/L, performed best for binary data (Table 2). PQL/L was similar to ML in the other cases, with overall slightly inflated type‐1 error, and a (usually small) fraction of negative LR, which is not unexpected given that the fitting procedure involves maximization of two distinct functions for distinct sets of parameters (see Methods and Supplementary material).

Table 2. Performance of PQL/L likelihood ratio tests. See Table 1 legend for details, except that λ estimates were not constrained below 5 for binary samples. The samples analyzed are exactly the same in both tables.

| σsp | ν | λ | Poisson < 0.05 | Poisson < 0.01 | Poisson LR | Binomial < 0.05 | Binomial < 0.01 | Binomial LR | Binary < 0.05 | Binary < 0.01 | Binary LR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.2 | 0.5 | low | 0.053 | 0.008 | 1.062 | 0.054 | 0.008 | 1.064 | 0.057 | 0.01 | 1.047 |
| 0.2 | 0.5 | high | 0.066 | 0.018 | 1.185 | 0.077 | 0.012 | 1.243 | 0.06 | 0.013 | 1.089 |
| 0.2 | 4 | low | 0.054 | 0.013 | 1.042 | 0.052 | 0.011 | 0.973 | 0.067 | 0.016 | 1.074 |
| 0.2 | 4 | high | 0.053 | 0.01 | 0.995 | 0.041 | 0.008 | 0.955 | 0.064 | 0.018 | 1.086 |
| 0.6 | 0.5 | low | 0.056 | 0.011 | 1.088 | 0.05 | 0.015 | 1.018 | 0.067 | 0.016 | 1.083 |
| 0.6 | 0.5 | high | 0.076 | 0.015 | 1.192 | 0.056 | 0.01 | 1.109 | 0.051 | 0.009 | 0.967 |
| 0.6 | 4 | low | 0.04 | 0.009 | 0.952 | 0.047 | 0.006 | 0.998 | 0.06 | 0.014 | 1.084 |
| 0.6 | 4 | high | 0.053 | 0.014 | 1.062 | 0.043 | 0.007 | 0.958 | 0.046 | 0.009 | 1.023 |

With only one draw, all methods might be expected to perform poorly. However, in contrast to the PQL approximation, the pν(h) approximation for likelihood may be inaccurately large for large λ, and the second‐order Laplace approximation even more so. For example, consider binary data in 6 locations, 5 ‘positive’ and one ‘negative’, for spatially independent random effects with linear predictor α+υi (i= 1,…,6). The log‐likelihood (directly computed by numerical integration) is maximized for α≈ 2, λ≈ 1.4; h‐likelihood is maximized for α≈ 1.7, λ≈ 0; pν(h) for α≈ 11.9, λ≈ exp(7.7), and the second‐order approximation appears to increase indefinitely, linearly with log(λ) for α≈ 0.

Although the joint estimates actually depend on the distinct objective functions used to estimate β and λ, this example predicts the observed trends. The inaccuracies of the Laplace approximations lead to a high frequency of ML fits diverging to very large λ values, often with no inferred spatial correlation, and to very inaccurate LR statistics. The PQL/L fits were comparatively much better behaved (Table 2). These problems may largely disappear in practice as soon as two draws are made in each spatial location (Fig. F1 in Supplementary material Appendix F). The ML fits may still be useful under some conditions, as the distributions of p‐values for samples that did not exhibit such a divergence were uniform. Likewise, the distribution was uniform if λ estimates were constrained as < 5 (Table 1). However, this constraint expectedly raises other problems; for example if the true λ= 100 (with other parameters as in the fourth row of the table), the test of fixed effects appears conservative, the mean likelihood ratio chi‐square statistic being 0.66.

Effects of bootstrap correction

To illustrate the effect of the bootstrap correction, we considered bad‐looking Poisson and binomial cases from the Tables. We further reduced the sample sizes, and performed PQL/L analyses, to accentuate small sample bias (with actually limited effect). The results are shown in Fig. 2 and confirm the effectiveness of the correction. The Gaussian case is illustrated in the next section.

Figure 2. Effects of bootstrap correction. Left: same parameters as in Poisson case, sixth row in Tables 1 and 2, but with only 20 sampled locations. Right: same parameters as in binomial case, second row in Tables 1 and 2, but with only 20 sampled locations with 20 draws per location.
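The logic of a bootstrap correction of a likelihood ratio test can be sketched on a non‐spatial toy case. The example below is a minimal illustration, not the procedure used for the figures: it assumes a Gaussian simple linear regression with 8 observations, and it uses a Bartlett‐type rescaling (the observed statistic is divided by the mean of the statistics simulated under the fitted null model). Both the choice of rescaling and all function names are assumptions of this sketch.

```python
import math
import random


def lr_slope(x, y):
    """LR statistic for H0: slope = 0 in a Gaussian simple linear regression."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = sum((b - my) ** 2 for b in y)   # residual sum of squares, null model
    rss1 = rss0 - sxy ** 2 / sxx           # residual sum of squares, full model
    return n * math.log(rss0 / rss1)


def chi2_1_pvalue(stat):
    """Upper tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def bootstrap_corrected_lrt(x, y, n_boot=200, seed=1):
    """Parametric bootstrap (Bartlett-type) correction of the LR test."""
    rng = random.Random(seed)
    n = len(y)
    lr_obs = lr_slope(x, y)
    # ML fit of the null model: common mean and ML variance.
    mu0 = sum(y) / n
    sd0 = math.sqrt(sum((b - mu0) ** 2 for b in y) / n)
    # Simulate under the fitted null model and recompute the statistic.
    boot = []
    for _ in range(n_boot):
        y_star = [rng.gauss(mu0, sd0) for _ in range(n)]
        boot.append(lr_slope(x, y_star))
    mean_boot = sum(boot) / n_boot
    # Rescale so that the corrected statistic has mean 1 (= df) under H0.
    lr_corr = lr_obs / mean_boot
    return {
        "lr_raw": lr_obs,
        "lr_corrected": lr_corr,
        "mean_boot_lr": mean_boot,
        "p_raw": chi2_1_pvalue(lr_obs),
        "p_corrected": chi2_1_pvalue(lr_corr),
    }


x = [float(i) for i in range(8)]
y = [1.2, 0.4, 2.1, 1.9, 2.8, 1.7, 3.3, 2.6]  # made-up illustrative data
print(bootstrap_corrected_lrt(x, y))
```

When the mean of the simulated statistics exceeds 1, the uncorrected test is anticonservative and the corrected p‐value is larger than the raw one. The same scheme applies to any model from which data can be simulated and refitted.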

Ad hoc sampling designs and glmmPQL performance

Simulations based on the migration gene study design exhibit a strong bias of the likelihood ratio test, as expected from such small samples. This case was used to check the efficiency of the bootstrap (with only 100 replicates) in correcting such a bias (Fig. 3 left).

Figure 3. Distributions of p‐values for slope (β) estimates from simulations based on real sampling designs. Left: results of uncorrected and bootstrap‐corrected likelihood ratio tests for data simulated according to the migration gene study design; right: results of ML and PQL methods for data simulated according to the onchocerciasis study design. In contrast to glmmPQL, PQL/L and ML results are barely distinguishable.

For LMMs, the data sets can also be analyzed using the lme procedure in R through a syntactic trick as described in Dormann et al. (2007, Appendix). However, this de facto constrains the analysis to models without residual error; the results are otherwise similar to those of the ML method (see Supplementary material Appendix G for details).

In simulations based on the onchocerciasis study design, the likelihood ratio tests based on either ML or PQL/L exhibited little bias, while analyses based on glmmPQL and the same syntactic trick were strongly anticonservative (Fig. 3 right). These comparisons are based on the t‐test in glmmPQL, as this procedure did not provide likelihood values. Supplementary material Appendix G also presents the distributions of estimates by the different methods for all simulated data sets, highlighting further problems with glmmPQL.

Discussion

In this work we have implemented and assessed methods for fitting GLMMs, for the poorly implemented case of spatial data. Our simulations confirm that these methods allow inferences of environmental and genetic effects in spatially correlated landscapes. Although we have focused here on inferences about fixed effects, the Supplementary material shows that the new procedures also provide better estimates of the spatial autocorrelation parameters (Supplementary material Appendix G, Fig. G1 and G2) and that glmmPQL may not provide useful estimates of the variance of the autocorrelated random effects.

The results confirm that some testing biases, leading to too narrow confidence intervals, are observed for small samples, in particular in the Gaussian case, in which case a bootstrap correction is recommended. This correction is of more general interest as it should be fast, and easy to perform with alternative software, in non‐spatial models. Otherwise, the testing biases are much smaller than those that can be observed for partial Mantel tests. For example, for the design based on the onchocerciasis study, Guillot and Rousset (2013) found that the error rate of the latter is 27.5% at the nominal level 5%. The spaMM package therefore allows for more reliable inferences. Another approach, where standard R software for linear mixed models is applied by specifying a dummy random effect, may sometimes yield good results but has several drawbacks. In linear mixed models, ML fits using the lme procedure effectively constrain the residual error to zero, and were also found to diverge in a large proportion of simulations. Likewise, glmmPQL should not be used to analyze non‐Gaussian spatial data, as it performs substantially worse than our implementation for spatial data. This may not be a problem with glmmPQL per se, which has been shown to perform more satisfactorily in other applications (Hamel et al. 2012), but may stem from the fact that it has to be used in a non‐standard way, to our knowledge not recommended by its authors, for the analysis of spatial data.

The main approximations used in this work have already been checked and compared to previous proposals such as penalized quasi‐likelihood or Markov‐chain simulation methods, mainly in terms of bias and variance of estimators for various specifications of the fixed and random effects (Lee et al. 2006, pp. 190–192, Noh et al. 2006, Pinheiro and Chao 2006, Jang et al. 2007, Noh and Lee 2007, Lee and Lee 2012). With the exception of the PQL/L results for binary data, the simulation results can be seen as a check of well‐established, though approximate, likelihood methods for GLMMs. However, there do not appear to be comparable simulations for spatially correlated models in the literature. Ignoring autocorrelation may have little effect on the bias of estimates of fixed effects but should result in underestimates of the variance and too narrow confidence intervals. Thus, our assessment of the properties of likelihood ratio tests is much more informative than a simple assessment of the bias of estimators.
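The effect of ignoring autocorrelation on the variance can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not part of the analyses reported here: it takes 20 locations on a line, an exponential correlation exp(−ρd) between locations at distance d (the Matérn model with smoothness 1/2), and compares the true variance of the sample mean with the value σ²/n obtained when independence is assumed. The layout, the parameter values and the function names are all hypothetical.

```python
import math


def variance_of_mean(coords, rho, sigma2=1.0):
    """True variance of the sample mean under correlation exp(-rho * distance)."""
    n = len(coords)
    total = 0.0
    for a in coords:
        for b in coords:
            total += math.exp(-rho * abs(a - b))
    return sigma2 * total / n ** 2


def naive_variance_of_mean(n, sigma2=1.0):
    """Variance of the sample mean when observations are assumed independent."""
    return sigma2 / n


def interval_width_ratio(coords, rho):
    """Ratio of the correct to the naive confidence interval width for the mean."""
    n = len(coords)
    return math.sqrt(variance_of_mean(coords, rho) / naive_variance_of_mean(n))


coords = [float(i) for i in range(20)]  # hypothetical transect of 20 locations
for rho in (0.1, 1.0, 10.0):
    print("rho =", rho, "width ratio =", interval_width_ratio(coords, rho))
```

A ratio above 1 means that the interval computed under independence is too narrow by that factor. As ρ increases the correlation vanishes and the ratio tends to 1.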

Comparison of models with or without spatial correlations is feasible within the present framework, as the model with spatial correlation includes the model without spatial correlation as a limit case (when the spatial scale parameter ρ becomes very large). However, for inference of fixed effects, it appears better to always include spatial autocorrelation in the analysis, even if autocorrelation appears non‐significant. In particular, a non‐significant autocorrelation can arise in a real data set because few localities are sampled, but this does not necessarily mean that the autocorrelation does not impact inference of fixed effects, because the statistical information about a fixed effect in this data set can decrease with increasing assumed level of autocorrelation. Such cases can be detected by comparing confidence intervals for fixed effects under (say) the fitted non‐zero autocorrelation, and in a model without autocorrelation. However, the proper interval for fixed effects is not the one given by the first of these two computations. Rather, it is given by the profile likelihood ratios, which are designed to take into account uncertainty in nuisance parameters.
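How a profile likelihood interval accounts for uncertainty in a nuisance parameter can be seen in a non‐spatial toy analogue. The sketch below is a minimal illustration, assuming Gaussian data with unknown mean (the parameter of interest) and unknown variance (the nuisance parameter, standing in for the autocorrelation parameters); the data and function names are hypothetical. It compares the 95% profile likelihood interval with the interval obtained when the variance is fixed at its estimate.

```python
import math

CHI2_1_95 = 3.841458820694124  # 95% quantile of the chi-square with 1 df
Z_975 = 1.959963984540054      # 97.5% quantile of the standard normal


def profile_loglik(mu, y):
    """Gaussian log-likelihood maximized over the variance (constants dropped)."""
    n = len(y)
    s2 = sum((v - mu) ** 2 for v in y) / n  # ML variance for this value of mu
    return -0.5 * n * math.log(s2)


def profile_interval(y, crit=CHI2_1_95, tol=1e-10):
    """Interval {mu : 2 * (l_max - l_profile(mu)) <= crit}, found by bisection."""
    n = len(y)
    mu_hat = sum(y) / n
    l_max = profile_loglik(mu_hat, y)

    def excess(mu):
        return 2.0 * (l_max - profile_loglik(mu, y)) - crit

    def bound(direction):
        lo, step = mu_hat, 1.0
        hi = mu_hat + direction * step
        while excess(hi) < 0.0:          # expand until outside the interval
            step *= 2.0
            hi = mu_hat + direction * step
        while abs(hi - lo) > tol:        # bisect on the boundary
            mid = 0.5 * (lo + hi)
            if excess(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return bound(-1.0), bound(1.0)


def plugin_interval(y, z=Z_975):
    """Interval that treats the estimated variance as if it were known."""
    n = len(y)
    mu_hat = sum(y) / n
    s = math.sqrt(sum((v - mu_hat) ** 2 for v in y) / n)
    half = z * s / math.sqrt(n)
    return mu_hat - half, mu_hat + half


y = [4.1, 5.3, 3.8, 6.0, 5.1, 4.4]  # made-up illustrative data
print("profile interval:", profile_interval(y))
print("plug-in interval:", plugin_interval(y))
```

In this toy case the profile interval is always wider than the plug‐in interval, because re‐estimating the variance for each candidate mean lets the likelihood drop more slowly away from the estimate.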

We have considered the Matérn correlation model for a first implementation, using generic matrix methods applicable to any correlation matrix. A well‐known issue for mixed models is that the computation time of fitting algorithms involving such matrix computations increases sharply with sample size, here with the number of sampled locations. For example, tests of the effect of climate variables on single‐nucleotide polymorphisms in Arabidopsis thaliana took nearly 10 CPU hours on 2 GHz processors on average (52 tests) when a large data set of 948 locations (Hancock et al. 2011) was considered. This can probably be shortened by first analyzing subsets of the data to define good starting values for the full analysis, as well as by more or less ad hoc optimization of the code, but not up to the point where interactive analyses can be considered. Sparse matrix techniques have often been used to analyze large data sets, or more generally to speed up computations (e.g. the lme4 package, Bates et al. 2012). Autoregressive models (Methods) have also been considered for the same reason, but as the spatial correlation matrices considered in this work are not inherently sparse, the feasibility of sparse matrix approximations and their ultimate impact on statistical inference is not obvious.

For binary data, Laplace approximations have clear weaknesses, in particular overestimating the likelihood of high variance of random effects. What is usually the crudest approximation, penalized quasi‐likelihood, was here better behaved, and should be used at least whenever divergence of λ estimates is observed in ML fits. This may be sufficient for the simple inference problems considered in this work, but not more generally. Other approximations to marginal and conditional likelihoods, and algorithms to fit models, could be considered. In particular, Diggle et al. (2003) developed MCMC methods for fitting the spatial GLMMs considered in this study. These methods might perform well by the present criteria when properly used, but this may be difficult to assess insofar as they require substantial user intervention on each data set (Diggle and Ribeiro 2007, p. 175). More recently, Rue et al. (2009) developed integrated nested Laplace approximation (INLA), which may give results similar to those of the methods used in this work (see Lee in discussion of Rue et al. 2009). Several other techniques are discussed in the literature (reviewed by Demidenko 2004, McCulloch et al. 2008), for mixed models distinct from the present one, and distributions of p‐values (or coverage of confidence intervals) are rarely considered, so it is again difficult to anticipate how well they would perform.

A prominent question in the recent literature is how to detect the effect of environmental features, rather than simply geographical distance, on genetic structure (landscape genetics, Guillot et al. 2009, Storfer et al. 2010), and similar questions arise in species distribution modelling (Algar et al. 2013). In a GLMM framework, assessing the effect of landscape features on gene flow or individual dispersal is equivalent to testing whether the correlation matrix of random effects is a function only of distance or also of other effects. This can be tested by comparing restricted likelihood values for models with different structures of the correlation matrix, provided that effects of landscape features on correlations among allele frequencies in different locations can be related to correlations among underlying Gaussian random effects in a GLMM.

In summary, inference problems in spatially autocorrelated landscapes can be addressed by fitting spatial generalized linear mixed models. Spatial analyses are recommended even in cases when spatial autocorrelation appears non‐significant because of insufficient power. However, software implementations have been limited in various respects, and this approach is often ignored in ecological and evolutionary studies. We have shown that valid inferences of fixed effects can be performed in small samples, using Laplace approximation or even penalized quasi‐likelihood approaches. A simple bootstrap method is recommended for Poisson and binomial data sampled in fewer than 20 locations, and more generally for Gaussian data. The present work makes all these tasks more practical, and provides more reliable inferences of both fixed and random effect parameters than previously available ones (in particular, glmmPQL), which cannot be recommended in a spatial context.

Acknowledgements

This work was supported by an exploratory program (PEPS) ‘Comprendre les maladies émergentes et les épidémies’. We thank N. Yoccoz for helpful comments on the manuscript, Y. Lee, L. Rönnegård, M. Alam, and M. Molas for discussion of h‐likelihood methods and software, A. Courtiol for stimulating discussion about other R packages, and J.‐M. Marin for further discussion and help in tracking some references. Most computations were performed on the ISEM computing cluster platform. We thank R. Dernat for assistance in using this cluster.

    Supplementary material (Appendix ECOG‐00566 at <www.oikosoffice.lu.se/appendix>). Appendix A–G and R scripts.
