Linear mixed models and geostatistics for designed experiments in soil science: Two entirely different methods or two sides of the same coin?

Soil scientists are accustomed to geostatistical methods and tools such as semivariograms and kriging for analysis of observational data. Such methods assume and exploit that observations are spatially correlated. Conversely, analysis of variance (ANOVA) of designed experiments assumes that observations from different experimental units are independent, an assumption that is justified based on randomization. It may be beneficial, however, to perform an ANOVA assuming a geostatistical covariance model. Also, it is increasingly common to have multiple observations per experimental unit. Simple ANOVA assuming independence of observations is not appropriate for such data. Instead, a linear mixed model accounting for correlation among observations made on the same plot is required for proper analysis. The purpose of this paper is to demonstrate the benefits of integrating geostatistical covariance structures and ANOVA procedures into a linear mixed modelling framework. Two examples from designed experiments are considered in detail, making a link between terminologies and jargon used in geostatistical analysis on the one hand and linear mixed modelling on the other hand. We provide code in R and SAS for both examples in two supporting companion documents.

Instead, a linear mixed model accounting for correlation among observations made on the same plot is required for proper analysis. The purpose of this paper is to demonstrate the benefits of integrating geostatistical covariance structures and ANOVA procedures into a linear mixed modelling framework. Two examples from designed experiments are considered in detail, making a link between terminologies and jargon used in geostatistical analysis on the one hand and linear mixed modelling on the other hand. We provide code in R and SAS for both examples in two supporting companion documents.

| INTRODUCTION
Data from observational studies (surveys) in soil science are often complex. In many cases the main challenge in statistical analysis is to account for and exploit spatial correlation among observations taken in close vicinity, which are typically more similar than observations farther apart. Soil scientists are accustomed to dealing with spatially correlated data, typically using geostatistical methods for analysis, such as the semivariograms and kriging for map interpolation (Diggle & Ribeiro, 2007;Isaacs & Srivastava, 1989;Oliver & Webster, 2014;Webster & Oliver, 2007). Conversely, designed experiments are routinely subjected to analysis of variance (ANOVA) procedures and regression techniques based on linear models, assuming that observations from different experimental units (plots, chambers, pots, etc.) are independently distributed, an assumption that is well justified if the experiment was properly randomized (Bailey, 2009).
Geostatistics on the one side and ANOVA on the other side may appear like two entirely different frameworks. For example, a soil scientist wanting to do an analysis of a designed field experiment to compare several tillage methods may wonder how to account for spatial correlation among several soil samples taken on the same plot. Remembering her geostatistics class, she turns to semivariogram modelling (Oliver & Webster, 2014) and draws up a map of soil properties for the trial, finding that there is in fact substantial spatial correlation both within and between plots. Having reached this far, she then turns back to the research question, which called for a comparison among several group means, and finds that a semivariogram or map does not provide such a comparison. The researcher then remembers how to perform a basic one-way ANOVA using her favourite statistical package but soon realizes that this method does not account for spatial correlation and in fact assumes independence between observations, which leaves her puzzled as to the best way forward.
Even though, on the face of it, geostatistics and ANOVA indeed appear like entirely different methods, they can, in fact, be integrated into a single powerful framework (Lark, Cullis, & Welham, 2006;Robinson, 1991;Stroup, 2002): linear mixed models. It is the purpose of this paper to illustrate this framework and its benefits to soil scientists and to align the jargon used in geostatistics with the terminology used for linear mixed modelling. Two examples from designed experiments serve to introduce different features of the framework.

| A FIRST EXAMPLE: A RANDOMIZED FIELD TRIAL TO STUDY THE SPATIAL AND TEMPORAL VARIATION OF ARBUSCULAR MYCORRHIZAL FUNGAL ABUNDANCE IN SOIL
Example 1 A randomized experiment was conducted in the year 2011 to study the small-scale spatio-temporal variation of arbuscular mycorrhizal fungal (AMF) communities under grassland (Buscot, Wubet, Goldmann, & Klemmer, 2018;Goldmann et al., 2020;Regan et al., 2014Regan et al., , 2015Regan et al., , 2017Regan, Berner, Boeddinghaus, Marhan, & Kandeler, 2019). The experimental area (10 × 10 m) was divided up into 30 blocks (2 × 1.67 m each), arranged in five rows and six columns as shown in Figure 1. Each block had six plots. Soil samples were taken at six dates throughout the year (5 April, 17 May, 27 June, 16 August, 5 October, 21 November). Within each block, the six sampling dates were randomly allocated to plots, so the sampling design was a randomized complete block design with dates as treatments. Each plot was split into two subplots (20 × 20 cm) to assess short-range spatial variation within dates and plots. The subplot samples are thus pseudo-replicates (Hurlbert, 1984; Box S1 in Supporting Information). The main purpose of these subplot samples was to help in estimating the semivariogram at short distances. In total, there were 360 subplots. One soil core was collected per subplot and spatially referenced using coordinates in metres from an origin situated in one corner of the trial area. AMF alpha-diversity was determined for each subplot using 18S rDNA pyrosequencing analysis (Goldmann et al., 2020). Here, we will consider the total abundance over all taxa (total observed taxonomical unit (OTU) richness). This variable was log-transformed (and then multiplied by 10) because all counts were greater than zero (minimum: 11; median: 24; maximum: 45) and diagnostic residual plots indicated that this transformation simultaneously achieved approximate normality and homogeneity of variance. One objective of the study was to assess the variation of AMF communities over time and in space. Figure 2 shows a kriged map for total AMF OTU richness at each of the six dates. Kriging was performed based on the exponential semivariance model fitted to the subplot data by weighted least squares using the number of observations per distance class (bin) of the empirical semivariances as weights (http://www.gstat. org/gstat.pdf, Table 4.2). The maps display substantial spatial heterogeneity at each date, and there is a notable evolution of AMF OTU richness over time.
An ANOVA was performed for plot means (Table 1), showing that there are significant differences among dates (F = 30.81, p < .0001), with the largest total AMF OTU richness occurring in October. The means for the 6 month in Table 2 agree well with the impression from the kriged maps in Figure 2.
The two basic analyses presented so far raise some questions. The ANOVA in Table 1 assumes that the data representing each plot are independent, justified by random allocation of treatments to plots, whereas the kriging done to generate maps in Figure 2 explicitly assumes that the sampling points are spatially correlated across the whole trial area and this correlation is actually the basis for the interpolation between sampling points that is needed to draw kriged maps. So, which of these two analyses is based on the correct assumption? Perhaps surprisingly, the answer is: both are! It will be shown in this paper that the two approaches, kriging and ANOVA, seeming so entirely different at first sight, can be reconciled and integrated into a unifying framework. Before considering that framework, we will review a few basics about spatial statistics and ANOVA.

| SOME BASICS ABOUT SPATIAL STATISTICS
In spatial datasets, if we want to estimate the overall variance in the data, we need to account for spatial correlation (i.e., points closer together will be more similar due to their proximity), unless the data are collected based on some kind of random sampling design (de Gruijter, Brus, Bierkens, & Knotters, 2006). Also, interpolation between sampling points is possible by exploiting spatial correlation. In order to model the spatial correlation between observations on a grid of points, one may estimate the semivariance. For random variables Y 1 and Y 2 the semivariance is defined as half the variance of the difference, F I G U R E 1 Sampling design for the 10 × 10 m field experiment (Example 1) across all six sampling dates (figure adapted from Goldmann et al., 2020) [Color figure can be viewed at wileyonlinelibrary.com] 1 2 var Y 1 −Y 2 ð Þ, and it is estimated by 1 2 y 1 −y 2 ð Þ 2 , where y 1 and y 2 are the two observations ("realizations") corresponding to Y 1 and Y 2 (Webster & Oliver, 2007). A plot of these semivariances against spatial distance of the two points, d 12 , is called the empirical semivariogram. A nonlinear function, the semivariance function, may be fitted to describe how the semivariance changes with distance ( Figure 3). The semivariance at zero distance, that is, the intercept of the semivariance function, is called the nugget. The semivariance then increases with distance, typically approaching an upper bound (i.e., an asymptote), which is called the sill and tells us the overall variation in the data after adjusting for possible correlations due to proximity. The semivariance is directly related to the covariance between observations. Thus, the covariance function, describing how the covariance between observations depends on spatial distance, can be analytically derived from the semivariance function as: where cov(Y 1 , Y 2 ) is the covariance between random variables Y 1 and Y 2 and var(Y 1 ) and var(Y 2 ) are their variances ( Figure 4). This covariance function can be used to F I G U R E 2 Semivariograms (exponential model) and kriged maps of total AMF OTU richness in experimental field for the six different dates.
Data were log-transformed (and multiplied by 10) in order to achieve approximate normality and homogeneity of variance. Analysis is based on subplot data. AMF, arbuscular mycorrhizal fungal; OTU, observed taxonomical unit [Color figure can be viewed at wileyonlinelibrary.com] compute the covariance of the response at an unobserved point, Y 0 , with the responses at the observed points, Y 1 , …, Y n . Exploiting this covariance, one can obtain an interpolated estimate of y 0 as a weighted mean of the observed values y 1 , …, y n . The weights minimizing the squared prediction error can be computed from the variances and covariances (Isaacs & Srivastava, 1989). This method of estimation is called kriging, named after the geologist Krige (1951) who invented it in order to estimate (interpolate) the likely amount of metal ore (e.g., gold nuggets) at an unexplored point based on the amounts found in a few surrounding boreholes of a mining operation. If several boreholes are drilled near the same point, right next to each other, the amount of nuggets varies locally, and the variance between these immediately neighbouring boreholes, whose distance is essentially zero, is therefore called the nugget variance. Generally, in other applications, the observed response will vary when measuring at the same spot due to microscale variation of the soil property and measurement error, and the nugget variance can therefore be associated with both of these sources of variation. Such errors should be accounted for also in applications outside of mining.
Most soil scientists will be familiar with these geostatistical methods. The main point to be reiterated here is that if kriging is applied, then usually a spatial covariance is assumed among all measurements and an explicit model is fitted to describe that covariance via the semivariogram (for exceptions see Webster & Oliver, 2007).

| SOME BASICS ABOUT ANALYSIS OF VARIANCE AND RANDOM EFFECTS MIXED MODELS
The assumption of a covariance among observations made in kriging is in contrast to ANOVA procedures for randomized experiments, which assume independence among experimental units (plots). The justification for this independence assumption is based on randomization, whereby treatments are allocated to experimental units at random (Bailey, 2009;Calinski & Kageyama, 2000). A randomized experiment can be analysed by a linear model with independent errors for the experimental units. With a single observation per experimental unit (plot), this leads to a simple ANOVA (Piepho, Möhring, & Williams, 2013). In the case of a completely randomized design (CRD), the underlying linear model just has an intercept, a treatment effect and a residual error term. When a randomized complete block design (RCBD) is used, the model also includes a block effect.
In Example 1, however, there are two observations per plot, and hence some care is needed in order to furnish a valid analysis. The experimental units are still the plots, but there are two subplots per plot, which constitute subsamples. One way to correctly analyse subsampled data is to compute an average per experimental unit and use this as the response (Piepho, 1997), as was done for Table 1. The linear model for a RCBD is: where y ij is the plot mean for the i-th date (treatment) in the j-th block, μ is a general intercept, τ i is the effect of the i-th date, b j is the effect of the j-th block, and e ij is the error associated with plot mean y ij . The plot errors are assumed to be independent, which is justified from randomization. It should be stressed that this independence Note: Data were log-transformed and then multiplied by 10 (Example 1). Analysis is based on plot means and assumes independence between plots. $ Means followed by a common letter are not significantly different according to a t-test at a significance level of α = 5%, based on model (4) (see Section "Some basics about analysis of variance and random effects mixed model"). AMF, arbuscular mycorrhizal fungal; OTU, observed taxonomical unit.
assumption is valid no matter what a semivariogram for the same data would tell us. All that is required really is proper randomization! This fact can hardly be overemphasized. Even if the semivariogram suggests that two adjacent plots are highly correlated, because they tend to be quite similar, we may still proceed with the classical ANOVA based on model (2), which assumes independence among the observations .
Randomization ensures that classical ANOVA will be perfectly valid. We will see in the next section that these two different assumptions, that is, spatial covariance for kriging and independence for classical ANOVA, can be put in a common framework that can help to decide which of the two assumptions is preferable. The key observation here is that both assumptions can be justified, and it's not the case that the one is correct and the other is wrong. We can also model the observed data at the level of subplots. It is tempting to use the model: where y ijk is the observation of the i-th date on the kth subplot on the ij-plot within the j-th block, e ijk is the subplot error associated with y ijk , and the other effects are as defined for model (2). In this model, the subplot errors e ijk are assumed to be independent with zero mean and constant variance σ 2 e . But is this the right model? No, it isn't! This model implies that subplots provide true replication, but subplots are just subsamples or pseudoreplications (Hurlbert, 1984;Piepho, 1997; Box S1). The number of true replications in this experiment is given by the number of plots per treatment (overall, there are 30 plots in this experiment), not the number of subplots (there are 60 subplots). A true replication is given by a plot because plots (not subplots) are the experimental units to which treatments (sampling dates in this case) are independently allocated by the randomization procedure, whereas subplots are the observational units. Both subplots in the same plot receive the same treatment, so subplots are not independent replicates. They just represent subsamples or pseudo-replications.
To account for experimental units (plots), the model for observed data needs to be extended as: where p ij is the random error of the ij-th plot within the j-th block, assumed to be independently distributed with zero mean and variance σ 2 p . This linear model has two sources of random variability, in contrast with classical linear models, which have only one source. The two random effects, that is, the plot error p ij and the subplot error e ijk , make this a linear mixed model. This model properly represents the randomization structure of the experiment as well as the subsampling per plot (Piepho, 1997). The ANOVA presented in Table 3 is based on this model, which can be used to work out the expected mean squares shown in Table 4, and these indicate that the mean square for dates needs to be tested against the mean square for plots, not the residual mean square for subplots. This is because the dates F I G U R E 4 Covariance and semivariance function for isotropic power model with nugget using fit for Example 1: ρ = 0.36, σ 2 s = 1:57, σ 2 e = 2:31. The functions are drawn only for distances >0. At a distance of zero, there is a discontinuity (not shown in the figure), with the covariance jumping to σ 2 s + σ 2 e = 1:57 + 2:31 = 3:88 and the semivariance dropping to zero (see Box 2 and Pinheiro & Bates, 2000, p. 231) [Color figure can be viewed at wileyonlinelibrary.com] F I G U R E 3 Explanation of terms sill, partial sill, nugget and range using fit for November from Figure 2 based on exponential semivariogram model. The number next to a plotted point of the empirical semivariogram denotes the number of pairs (semivariances) on which the point is based [Color figure can be viewed at wileyonlinelibrary.com] mean square has the same expectation as the plots mean square under the null hypothesis of no date effects. Good ANOVA packages will compute these expected mean squares and automatically determine the proper error terms for each effect in the model (Milliken & Johnson, 1992).
The expected mean squares in Table 4 assume balanced data, that is, the blocks are complete and each plot has the same number of subplots. It is important to note that the Ftests for treatments and blocks in Table 3, which are based on an analysis of subplot data using model (4), are identical to the ones obtained for an analysis of plot means based on model (2) as shown in Table 1. The results are identical because the data are balanced and a simple variance components model with independent random effects is used. If a linear mixed model package is used and the model is correctly specified as in (4), the appropriate F-tests for fixed effects will be computed automatically, also with unbalanced data (Piepho & Edmondson, 2018).
It should be stressed that the linear mixed model (4) implies a covariance among the two subplots on the same plot. That covariance is given by the variance σ 2 p of the plot effect, which is shared by the two observations made on the same plot. So far, we have not considered the status of the block effect. This effect may also be modelled as random with zero mean and constant variance (σ 2 b ). This effect induces a further covariance among observations that are made in the same block. Thus, according to this model, there are three levels of covariance: (a) observations in different blocks have a covariance of zero, (b) observations in the same block but on different plots have a covariance of σ 2 b , because these share the same block effect, and (c) observations on the same plot have a covariance of σ 2 b + σ 2 p , because they share not only the same block effect but also the same plot effect. It may be added that the total variance of an observation is σ 2 b + σ 2 p + σ 2 e . These considerations show that the linear mixed model (4) also provides a way to model spatial covariance among observations in the same trial, albeit a somewhat cruder one. Its form is different from the covariance model assumed in kriging, but both represent spatial variances and covariances. The main difference is that kriging assumes that covariance is a smooth function of distance, whereas the randomizationbased model implies a discontinuous change of covariance at the block and plot boundaries. For example, at the block boundary the covariance drops to zero, which may seem unnatural, but again this model is justified purely based on randomization and the non-smooth covariance may also be a good approximation to an assumed underlying smooth covariance function.
In the preceding paragraph, we have taken the mixed model (4) as a point of departure and derived implications for spatial covariance. Conversely, starting with a spatial covariance model for residuals, one could infer the between and within-block and/or plot variance components using Krige's relation (Webster & Oliver, 2007, pp. 60-62), which further emphasizes the close connection between the geostatistical and mixed modelling frameworks.

| JOINING SPATIAL COVARIANCE MODELLING AND ANOVA
The analysis of variance in Table 1 allowed a comparison among dates in terms of mean total AMF OTU richness.

T A B L E 3
Analysis of variance for total AMF OTU richness Note: Data were log-transformed and then multiplied by 10 (Example 1). Analysis is based on subplot data. AMF, arbuscular mycorrhizal fungal; OTU, observed taxonomical unit. a Plots mean square was the error term (see Table 4). b Subplots mean square was the error term (see Table 4).

T A B L E 4
Expected mean squares in ANOVA for total AMF OTU richness (Example 1) under model (4) based on subplot data

Degrees of freedom
Expected mean square Note: r, number of blocks (replicates); t, number of treatments (dates); s, number of subplots per plot. AMF, arbuscular mycorrhizal fungal; OTU, observed taxonomical unit. Such a comparison is not possible by the semivariograms computed separately for each date ( Figure 2). Conversely, the semivariograms allowed modelling the spatial covariance among observations and generating interpolated maps, an option not available with classical ANOVA, which assumes independence among experimental units. Now wouldn't it seem more consistent to have a unified modelling approach that allows both map interpolation and makes mean comparisons?
The key to unifying both approaches is to incorporate the modelling of the spatial covariance into the linear mixed modelessentially by modelling the spatial covariance, that is, the spatial dependence structure of the data (Gilmour, Cullis, & Verbyla, 1997;Pinheiro & Bates, 2000). To this end, we can extend model (4) for Example 1 by including a spatially correlated component as: where s ijk is a random subplot effect representing a spatially correlated trend, whereas the independent residual error e ijk represents the nugget effect and is also associated with subplots. A smooth spatial covariance model is assumed for s ijk . For example, one may assume that the covariance for the effects s ijk of two points that are a distance d apart is σ 2 s ρ d (Figure 4). According to this model, known as the exponential or power model, the covariance decays exponentially with distance, and it does so at the same rate in each spatial direction, which is why the model is called isotropic (Schabenberger & Gotway, 2004). An extension to the isotropic power model is to assume that the decay of covariance differs among the two spatial directions. Such models are called anisotropic. One such model assumes that the covariance among two points is σ 2 where d x and d y are the distances in the x-direction and y-direction. A straightforward further extension of this model allows accounting for temporal distance d t as a third dimension using a time coordinate t (in days), such that the covariance is σ 2 Such a model may be more plausible than a purely spatial one in Example 1 because observations are indeed made at different points in time and a decay of covariance is expected with increasing temporal distance among observations. It should be mentioned here that this simple option for spatio-temporal (space-time) modelling has many contenders (see Final Remarks), some of which are rather more complex and not all of which are as easy to implement with a mixed model package.
We may fit model (5) with different spatial covariance models, as well as several sub-models obtained from that model by dropping random effects, in order to see which model best fits the data. To compare models, the Akaike information criterion (AIC) can be used, with smaller values indicating better fit. For details see Box 1. Model fits are reported in Table 5, along with average standard errors of a difference (s.e.d.) to see how this varies with model choice. The fact that the s.e.d. are quite variable shows that model choice matters with respect to statistical inference (significance tests and confidence intervals). It should also be mentioned for clarity that s.e.d. is no measure of goodness-of-fit and should not be used for model choice. Also, a good model fit based on AIC does not generally mean that s.e.d. will be low compared to other models.
The first three models reported in Table 5 lack random effects for blocks and plots and so do not provide a valid randomization-based analysis, nor do they have a spatial covariance to make up for this lack, so they would not normally be used for analysis, unless the two corresponding variance components converge to zero. The fit statistics (AIC) indicate that the plot effect as well as the block effect is important. When these effects are included, adding a spatial covariance structure hardly improves the fit, even if anisotropy is allowed for. The results also show that a nugget effect is needed when a spatial covariance model is used. Among the purely spatial models, the anisotropic power model with nugget fits best according to AIC. We also fitted a model with all random effects, that is, the randomization-based model with a spatial add-on component. This model fitted best when the anisotropic spatial covariance model was used. Some further improvement was achieved by adding time as a third dimension in the anisotropic covariance structure. This model also has the largest standard error of a difference.
To compare the date means, the ANOVA based on model (4) has the virtue of simplicity. It is derived from the randomization structure of the experiment and as such guarantees a valid analysis, provided the distributional assumptions of normality and homogeneity of variance holdthese can be inspected visually using the residual plots (Kozak & Piepho, 2018). A great advantage in using this model is that no further model choice is needed; one may just rely on randomization theory and use the one model that is induced by randomization. One can also use a spatial model for the purpose (Table 6), which may entail a gain in precision, but this requires selecting among a larger number of candidate models (using, e.g., the AIC), and one can never be sure to have selected the best model. Here, we only considered the power model for brevity, but there are many more models to choose from (Schabenberger & Gotway, 2004; also see Example 2 for a few other models). Thus, when a randomized design has been used, the randomization-based linear mixed model can always be used for analysis, both because of its simplicity and the fact that it is backed by randomization theory. Conversely, a spatial model allows generating maps such as those in Figure 2, whereas this is not afforded by an ANOVA model such as (4), and the same spatial model can also be used for mean comparisons. It is worth mentioning in this context that kriging is equivalent to the standard method for estimating random effects in a linear mixed model, that is, best linear unbiased prediction (BLUP), applied to the spatial trend effect s ijk (Lark, 2009;Robinson, 1991). So, what may have seemed like entirely different modelling approaches initially, can justly be regarded as variations of the same general linear mixed model. Figure 5 shows such kriged maps obtained from the spatio-temporal mixed model described at the bottom of Table 5. Note that in this spatio-temporal model we have intentionally dropped random effects for blocks and plots in order to ensure smoothness of the interpolated map across the whole field. One advantage of kriging based on a spatiotemporal model as used in Figure 5 is that temporal correlation can be accounted for, providing a smooth transition between maps for different time-points and allowing all data to be used for each map, whereas the variograms in Figure 2 ignore such correlations and can only make use of the data measured at a given time-point.

| A SECOND EXAMPLE: A RANDOMIZED FIELD EXPERIMENT WITH MULTIPLE SPATIALLY REFERENCED MEASUREMENTS PER PLOT
We will consider a second example to further illustrate the flexibility of mixed models with spatial modelling of the residual error.

Box 1 Variance-covariance model selection by the Akaike information criterion (AIC)
The Akaike information criterion (AIC) is defined as: where L R is the residual likelihood and q is the number of parameters in the variancecovariance model (Wolfinger, 1996). Small values of AIC indicate the better models among a set of considered models, with a difference in AIC of two or less being negligible, and a difference of more than 10 points giving strong preference to the model with the smallest AIC (Burnham & Anderson, 1998). The assumption underlying the AIC is that there is an unknown model that may be assumed to have generated the data but is too complex to be formulated as a statistical model that can be fitted to data. Thus, we can only hope to approximate the true model using simpler statistical models. The AIC is designed to select the model in a candidate set that best approximates the underlying true model.
Likelihoods for different models are comparable only when the exact same data are used. When comparing different variance-covariance models using the AIC computed from the residual likelihood, it is therefore important to use the same specification for the fixed effects throughout. This is because the residual likelihood is based on data that correspond to contrasts (residuals) removing fixed effects and hence the data will change when the fixed effects model is modified. When models with different fixed-effects specifications are to be compared, the estimation method should therefore be changed from residual maximum likelihood (REML; the default in most mixed model packages) to maximum likelihood (ML) and q would include both the fixed effects and the covariance parameters. (Note: The "nlme" package defines q in this way even with REML. Although this does not affect comparison of models having the same fixed-effects structure, it seems more natural to only count the number of covariance parameters when using REML.) The fixed-effects specification for selecting the variance-covariance model is best chosen to be the most complex model possible or reasonable. In designed experiments with replication, it is preferable to fit the saturated model defining all treatment factors as qualitative. In observational studies, or when covariates are used, it is not generally obvious what the most complex or saturated model is. There are usually competing models, and it is possible that the model used for selecting the variance-covariance structure turns out not to be the final one after the subsequent inspection of the fixed-effects structure. So, there is obviously a chicken-and-egg problem here that is impossible to fully resolve in general, and hence there are different possible variations to this model selection strategy.
Example 2 An on-farm trial with oilseed rape was performed at the experiment station Ihinger Hof of the University of Hohenheim using a randomized complete block design with six replicates (Piepho, Möhring, Pflugfelder, Hermann, & Williams, 2015). There were three different tillage methods (mulch seeding, and strip tillage with or without stubble cleaning) and two fertilizer methods (subsoil fertilization and broadcasting). Five of the six possible factorial treatment combinations were tested (Table 7), whereas one combination was omitted because it was not deemed agronomically relevant.
Yield was recorded on all plots using an online monitoring system, providing a total of 4,569 spatially referenced measurements per plot (WGS84). The plot size was 1,500 m 2 . At each position, a soil fertility score called Ackerzahl was recorded, which accounts for several environmental factors such as soil type, climatic conditions and landscape characteristics (Schachtschabel, Blume, Hartge, & Schwertmann, 1976). The score ranges from 0 to 100, with large values indicating high fertility. The objective of this experiment was to assess the effect of the five treatments as well as of the soil fertility score on yield. The variables available for analysis are shown in Table 8.
Why we need a mixed model: Similar to Example 1, we have two aspects of interest in our dataset: we would like to know the effect of the treatments (tillage and fertilization) on the yield, and we have a spatial heterogeneity (represented as the fertility score) which affects the yield. As a point of departure, to compare treatments only, we may consider an ANOVA-type linear model of the form: where r k is the effect of the k-th replicate, τ j is the main effect of the j-th tillage method, ϕ i is the main effect of the i-th fertilization method, (τϕ) ij is the ij-th interaction effect, and e ijkh is a random residual error for the h-th observation on the ijk-th field plot. Using the variables in Table 8, the same model can be represented equivalently as: where REP stands for r k (REP ≡ r k ), T ≡ τ j , F ≡ ϕ i , and T•F ≡ (τϕ) ij . This symbolic notation is akin to the These models do not provide valid analyses as they neither have effects for blocks and/or plots, nor do they have a spatial covariance structure to make up for this lack. d The spatial model considered here is the isotropic power model with covariance σ 2 s ρ d . When the model is labelled as anisotropic, we fitted the anisotropic power model with covariance σ 2 s ρ dx x ρ dy y . The anisotropic spatio-temporal model has covariance σ 2 s ρ dx x ρ dy y ρ dt t , where x, y and t are the two spatial and the one temporal coordinates (in metres and days, respectively) and d x , d y and d t are the corresponding Euclidean distances. syntax needed to specify a linear model in statistical packages (Patterson, 1997;Wilkinson & Rogers, 1973). We are introducing this symbolic notation here because it is convenient for the statement of some models later in this example. All effects of this model are fixed, apart from a random residual error term that is not explicitly stated in this formulation. The residual error is associated with the spatially referenced measurement points on the plots, of which there are many per plot. The usual assumption of independent errors is not justified here, however, because the treatments (combinations of T and F) were not randomly allocated to the measurement points, of which there are 4,569 in the whole trial, but to the plots, which therefore are the experimental units (Casler, 2015). As a general rule, each experimental unit should be represented by a random effect in the model (Piepho, Büchse, & Emrich, 2003). Thus, we add a random effect p ijk for plots: with PLT ≡ p ijk . In Equation (7b), we have listed the random effect after the colon. All other effects are listed before the colon, which indicates that they are regarded as fixed (Patterson, 1997). In addition to the random effect PLT, there is an independent residual error effect, which is not usually specified explicitly, so there are actually two random effects, one for plots and one for errors associated with the individual measurements. It is the mix of fixed and random effects beyond the residual error that renders model (7) a mixed model (Eisenhart, 1947).
Selecting the variance-covariance structure for the random effects: By including a random effect for plots in the model, a positive covariance among all measurements taken on the same plot is modelled, the covariance being equal to the variance of plot effects. In such a model, all pairs of points in the same plot are equally correlatedno matter their distance from each other, which may not be a tenable assumption. It may be more realisticfor example, if plots are largeto assume that covariance is a decreasing function of spatial distance. For example, the covariance could decrease linearly, exponentially or according to a power function with distance. As a result, there are many spatial or geostatistical variance-covariance models suitable for this purpose, including the spherical, Gaussian, power and exponential models (Isaacs & Srivastava, 1989;Pinheiro & Bates, 2000, Table 5.2 on p. 232; Diggle & Ribeiro, 2007;Webster & Oliver, 2007). Box 2 gives an exemplary explanation of the Gaussian model (Figure 6), illustrating that a model can be parameterized in different ways, for example, in terms of the covariance itself or in terms of a semivariogram. The latter parameterization involves geostatistical terms such as sill, partial sill, nugget effect, range, semivariance function and correlation function (see Section "Some basics about spatial statistics"). Different software packages use different parameterizations of spatial covariance models, so the purpose of Box 2 is to clarify the relations between these different parameterizations and statistical terms.
The plot effects p ijk in (7) are assumed to have a normal distribution with zero mean and variance σ 2 p . These effects are assumed to be independent between plots. Again, this independence assumption is justified because of the randomized allocation of treatments to plots (Hinkelmann & Kempthorne, 1994). Invoking the same randomization argument, we can assume errors e ijkh to be independent between plots, but correlated within plots. Alternatively, we could assume that spatial covariance is also present between errors in different plots, which is not pursued here (but see Example 1).
When both the random plot effect and the residual errors are independent between plots, the responses from different plots are independent as well. They then correspond to "subjects" in terminology that is commonly used with repeated-measures designs, where repeated

1.85
Note: Data were log-transformed and then multiplied by 10 (Example 1). Analysis based on RCBD model accounting for plots (see Table 5). AMF, arbuscular mycorrhizal fungal; OTU, observed taxonomical unit. $ Means followed by a common letter are not significantly different according to a t-test at a significance level of α = 5%, based on subplot data assuming a linear mixed model with a nugget and anisotropic power model with covariance σ 2 s ρ dx x ρ dy y ρ dt t , where x, y and t are the two spatial and the one temporal coordinates (in metres and days, respectively) and d x , d y and d t are the corresponding Euclidean distances. § Average least significant difference was computed for every pairwise comparison separately using the second-order Kenward-Roger method (Kenward & Roger, 2009; see Box S2) and then averaged across all pairs.

Box 2 The Gaussian model of spatial covariance and some alternative parameterizations
Different statistical software packages use different parameterizations (or ways of expressing the same models), which often makes it difficult to compare results. Here we show how linear mixed model outputs from SAS PROC MIXED and the R packages "nlme" and "geoR" can be compared.
As an illustration, we use a spatial model with a Gaussian covariance structure.
According to the Gaussian model, the covariance between errors f 1 and f 2 of two measurements is: where d 12 is the Euclidean distance between the two measurements, σ 2 s is a spatial variance component, and θ is a parameter determining how quickly the covariance decays with distance (it is therefore denoted as the range parameter). The variance of an error f 1 , corresponding to the covariance of two observations when d 12 = 0, equals: where σ 2 e is the nugget variance. Note that (B2) corresponds to a decomposition of the h-th error f h according to: where s h is a spatially correlated component (with variance σ 2 s ) and e h is the independent measurement error (with variance σ 2 e ). For the Gaussian model the semivariance for d 12 > 0 can be derived as: where γ(d, θ) = 1 − exp(−d 2 /θ) is the semivariogram model (Pinheiro & Bates, 2000, p. 230; also see Table 5.2 on p. 232 for typical alternative choices of γ(d, θ)). Thus, the semivariance is a nonlinear function of spatial distance d 12 having intercept equal to the nugget variance σ 2 e and increasing asymptotically to the sill σ 2 = σ 2 s + σ 2 e . The difference between the sill and the nugget, denoted as partial sill, is equivalent to the spatial variance σ 2 s . The semivariogram model is one way to express or parameterize the spatial covariance model, but there are other possibilities. For example, it may be defined in terms of the correlation function h that is continuous in d (Pinheiro & Bates, 2000, p. 231), so that the correlation at d = 0 can be included: where and c 0 (0 ≤ c 0 ≤ 1) is referred to as a nugget effect (this is not the same as the nugget variance σ 2 e in (B4), but note that other authors refer to the variance σ 2 e as the nugget effect (e.g., Diggle & Ribeiro, 2007, p. 56)). The covariance is given by σ 2 h nugget (d, c 0 , θ), where σ 2 is the variance of an observation (it's also the sill in the semivariogram; see above). The corresponding semivariogram model (with nugget included) can be written as (Pinheiro & Bates, 2000, p. 232): We can make the following one-to-one mapping of the two parameterizations regardless of the chosen semivariance function γ(d, θ) (see Pinheiro & Bates, 2000, p. 232, Table 5.2 for common choices): σ 2 e = c 0 σ 2 and ðB8Þ The inverse relations are: measures are taken on the same experimental unit (subject) at different points in time (Piepho & Edmondson, 2018). In such designs, a linear regression on time is often used to model time trends, in which case there is a random intercept and a random slope for each subject (Verbeke & Molenberghs, 2000). The random plot effect in (7) corresponds to a random intercept term if plots are regarded as independent subjects, but the model has no corresponding slope term as there is no time covariate. We mention these connections to repeated measures because some mixed-model packages (e.g., "nlme" and "lme4" in R) require modelling the plot effect as an "intercept", while omitting the slope term, and report the corresponding variance estimate accordingly (see R companion, Box R1), which will be confusing to the novice not familiar with this background. In order to model spatial correlation among observations in the same plot, we insert a spatially correlated effect s ijkh into the model (7): where the variance σ 2 e of the independent error effect e ijkh plays the role of a nugget effect in geostatistical parlance. In order to select a model for spatial covariance, we here use the fixed-effects specification in (8), adding linear and quadratic terms for the covariate AZ (see Box 3 and Table 11), because preliminary scrutiny of the data revealed that a quadratic model might give a good fit. The extended fixed part of the model can be written as:

expected YIELD = REP + T + F + T•F + AZ + T•AZ
The parameterization in terms of σ 2 s and σ 2 e is used in the MIXED procedure of SAS and in the R package "geoR", whereas the parameterization in terms of σ 2 and c 0 is used in the R package "nlme". See Table 10 for an overview of how the outputs relate to each other.
F I G U R E 5 Kriged maps for the six time-points of Example 1 using a spatio-temporal mixed model with nugget and anisotropic power model for the trend effects s ijk , with covariance given by σ 2 s ρ dx x ρ dy y ρ dt t , where x, y and t are the two spatial and the one temporal coordinates (in metres and days, respectively) and d x , d y and d t are the corresponding Euclidean distances. The colour scales are allowed to be different between subplots so that the colour range is about the same in each subplot. BLUP, best linear unbiased prediction [Color figure can be viewed at wileyonlinelibrary.com] Table 9 shows the results for different spatial models with and without nugget variance when errors are assumed to be independent between plots. There is a strong indication that a nugget variance is needed. The Gaussian model with nugget fits best because it has the smallest value of AIC. The spatial coordinates had to be rescaled in some cases in order to achieve convergence. For details regarding trouble-shooting for numerical problems in fitting spatial models see Piepho et al. (2015). The parameter estimates for the Gaussian model with nugget and random plot effect are shown in Table 10, where we compare output from different packages.
Selecting the fixed effects: Having selected a suitable variance-covariance model (i.e., the Gaussian model with nugget and a random plot effect), we now turn to selecting the fixed-effects part of the model. This is done by sequential Wald-type F-tests (Type I tests in SAS; Piepho & Edmondson, 2018). Observing the marginality principle (Nelder, 2000), we fit main effects before the corresponding interactions and lower-order polynomial terms before higher-order terms. Generally in mixed models, the denominator degrees of freedom must be approximated (Halekoh & Højsgaard, 2014;Kenward & Roger, 1997;Kuznetsova, Brockhoff, Christensen, & Pødenphant Jensen, 2019). Here, we use the second-order method of Kenward and Roger (2009) (Box S2 in Supporting Information). The resulting Wald-type F-tests for the fixed effects in model (9) are shown in Table 11.
It emerges that there is no interaction between T and F (terms T•F, AZ•T•F and AZ 2 •T•F are all non-significant). The regression on AZ requires a linear term for

Box 3 Polynomial regression in factorial experiments
Consider a regression on the quantitative variable AZ. A linear regression can be written as: where η(AZ) is the expected response for given value of AZ, β 0 is the intercept and β 1 is the slope. The model assumes that the relationship between expected response η(AZ) and AZ is linear. If the response is non-linear, higher-order polynomial terms may be added. For example, a quadratic polynomial can be written as: where β 2 is the regression coefficient for the quadratic term AZ 2 . It is an important principle (the so-called marginality principle; Nelder, 2000;Piepho & Edmondson, 2018) in polynomial regression for a model with polynomial term up to the p-th degree, all lower-order terms must be retained as well. In the quadratic model this means we cannot drop the linear term. This is because a model of the form η(AZ) = β 0 + β 2 AZ 2 would have an optimum at AZ = 0. Such a response is not usually biologically plausible. For the optimum to be located away from the point AZ = 0, the linear term is required. Similarly, we do not omit an intercept term when the linear term is significant because a regression through the origin is not usually plausible. Note that the intercept can be regarded as the polynomial term of order zero, as β 0 AZ 0 = β 0 × 1 = β 0 .
Model (B13) represents a single curve. If the experiment involves treatment factors, there may be a separate curve for each treatment level. For example, with factors tillage and fertilization method, there can be a separate curve for each combination of tillage and fertilizer method. Hence, all polynomial terms need to be indexed by both factors. Using subscripts i and j for fertilizers and tillage methods, respectively, the extended quadratic model is: For each of the polynomial terms, we may use a two-way factorial model, as in model (6). Thus, we would assume: Plugging these sub-models into (B14), we obtain this model: In symbolic form, and adding a replicate effect, this model can be written as given in Equation (9). factor T (tillage method) and a linear and quadratic term for factor F (fertilization method). Thus, the final fixedeffects part of the model is: We note that the main effects AZ and AZ 2 were omitted here because they are absorbed by the factorial terms T•AZ, F•AZ and F•AZ 2 .
Obtaining the parameter estimates: The parameter estimates for this model are shown in Table 12. Note that we have adhered to the marginality principle here, which requires for a polynomial of given degree to retain all lower-order terms regardless of their significance (Box 3). Thus, for example, having decided to go for a quadratic model including the term F•AZ 2 , we retain the lowerorder terms F•AZ and F even though they are both nonsignificant. This ensures that the maximum or minimum of the fitted quadratic curve is allowed to be removed from the point AZ = 0. Table 12 shows parameter estimates obtained with two different packages (SAS procedure MIXED and R-package "nlme"). The two polynomial regression terms agree, but all other terms have different estimates, due to differences in restrictions imposed on the parameter estimates. These differences are only due to the parameterization, and the fits are entirely equivalent as discussed in Box 4. The fitted model is plotted in Figure 7. Predictions were computed for the average across replicate effects (Box 4). There is a pronounced curvature for broadcast fertilization (F2), whereas the curve is almost linear for subsoil fertilization (F1). Usually, one would expect yield to increase monotonically with the fertility score, but apparently for broadcast fertilization the crop was not able to make full use of the improved fertility. In fact, the weather after sowing was particularly dry and based on visual impression the broadcast fertilizer was not very effective (Wilfried Hermann, Universität Hohenheim, personal communication).

| FINAL REMARKS
Some authors have warned users of the risk of numerical instability when using the Gaussian model for kriging (Webster & Oliver, 2007, p. 119). We did not meet such difficulties with Example 2. Goovaerts (1998) suggested Treatments tested in on-farm trial with oilseed rape and corresponding definition of treatment factors T (tillage method) and F (fertilization method) (Example 2) either fitting a small nugget effect to evade extreme interpolative properties and artefacts or not using the Gaussian model for kriging at all. We would suggest generally fitting a nugget effect in mixed model implementations and have done so in both examples. In our experience, a nugget effect is very often needed, and if it is not, the REML algorithm is likely to converge to zero or some negligible value for the nugget (Piepho et al., 2015). The Matérn model may be considered as a viable and flexible alternative to the Gaussian model. The Matérn correlation function is given by: Box 4 Why do SAS PROC MIXED and R "nlme" and "lme4" print different parameter estimates? Parameterizations of fixed effects for qualitative factors in linear models and computation of marginal models Linear models for qualitative factors are overparameterized, meaning there is no unique solution for the effects. Also, different packages may give widely different numerical solutions for the same model, which may be a source of confusion. Luckily, however, these differences are nothing to be worried about, as all of these solutions are entirely equivalent as regards quantities of interest such as treatment means, treatment differences, predicted values in regressions, etc.
To illustrate, consider a simplified version of model (9), written in classical form as: where y ih is the yield of the h-th observation of the i-th fertilizer treatment, μ is a general intercept, δ i is the i-th fertilizer effect, and e ih is an error term associated with y ih . The tangible quantities of interest in this model are the two treatment means (i.e., the average yield for treatment 1, η 1 = μ + ϕ 1 , and for treatment 2, η 2 = μ + ϕ 2 ). The model is overparameterized because there are only two means (η 1 and η 2 ), but three parameters (μ, ϕ 1 and ϕ 2 ). This is solved by imposing a restriction on the parameter estimates, such as setting a treatment effect equal to zero (R packages set the first of the treatment effects to zero, whereas SAS packages set the last one to zero.). Whichever restriction we impose, the resulting treatment means are the same. For illustration, assume that the means are η 1 = 50 and η 2 = 60. When we impose the restriction ϕ 1 = 0, these means are obtained with μ = 50 and ϕ 2 = 10. When the restriction is ϕ 2 = 0, we must have μ = 60 and ϕ 1 = − 10 to obtain the same means.
There are subtle differences in the way these restrictions are implemented in different packages, and we mention these differences because they do have important consequences as regards the printed output. R packages "nlme" and "lme4" effectively drop the effect(s) that are set to zero from the model, thereby rendering the design matrix for fixed effects, usually denoted as X, of full column rank, implying that regular matrix inverses can be used to solve the normal equations/mixed model equations for the effects remaining after dropping the effects that have been set to zero. The important practical consequence of this approach is that the effects that are set to zero do not appear in the printed output. By contrast, SAS retains all parameters and uses so-called generalized inverses to solve the normal/mixed model equations. Thus, solutions are printed for all parameters, including the ones that are set to zero. This is seen in Table 12, where the zero effects are represented by dashes for the output of "nlme", whereas there are printed zeros for such effects on the output of SAS/MIXED. Now consider the fitted model (9) for Example 2. It is readily verified from Table 10 that, despite the numerically very different solutions, the important quantities derived from these solutions are the same, including effect differences between factor levels. For example, the difference between the effects of replicates 2 and 3 (REP2-REP3) equals 0.042 with both sets of solutions. Similarly, F1-F2 = 3.0816 and AZ•F1-AZ•F2 = −0.1168 with both solutions.
In classical notation the model (10) has a separate intercept for each replicate and treatment combination, given by μ 0 + r k + τ 0j + ϕ 0i . To report a single model per treatment combination, it is useful to average this across replicate effects, yielding the mean intercept μ 0 + r • + β 0j + δ 0i , where r • = r 1 + r 2 + r 3 + r 4 + r 5 + r 6 ð Þ =6 . This average intercept was used to draw the five curves for the five treatment combinations in Figure 7. Again, it is readily verified that the mean intercepts are the same with both sets of solutions printed in Table 12.
where ϕ > 0 is the range parameter, v > 0 is the smoothness parameter, Γ(.) is the gamma function, and K ν (.) is the modified Bessel function of the third kind of order ν (Diggle & Ribeiro, 2007, p. 51-53). This model has well-known limiting cases, such as the exponential ν = 1 2 À Á and the Gaussian (ν ! ∞). The model is notoriously difficult to fit because the parameters are far from orthogonal. Diggle & Ribeiro (2007, p. 114) therefore suggest profiling the likelihood over a grid of values for the smoothness parameter ν. We did this for the data in Example 2, using the grid ν = 1, 2, …, 15 and found a fit of AIC = 3,140.8 (ν = 1) for the model without nugget and AIC = 2,058.3 (ν = 5) for the model with nugget. The fit for the model with nugget is marginally better than the Gaussian model in Table 9 (a difference smaller than two points, indicating equally good fit; see Box 1). Provision of good starting values was essential to get the Matérn models to converge. In this example, the Gaussian model may be preferred because it is simpler to fit and only a little different from the Matérn model.
Randomization in designed experiments justifies the assumption of independence among experimental units in ANOVA. This paper considered two experiments where several observations were made per experimental unit. Such subsampling data cannot be modelled as independent observations but require accounting for correlation among observations made on the same experimental units. Geostatistical covariance models can be used for this purpose, as illustrated using two examples. It should be stressed, however, that although geostatistical modelling can to some extent make up for lack of randomization, such modelling is no substitute for randomization.

T A B L E 9
Selection of variance-covariance model for Example 2 assuming the fixed-effects specification (9) (see Table 10 Note: In all spatial models, covariance is assumed to be present between measurements taken on the same plot, but observations from different plots are assumed to be independent. All models include a random plot effect as in Equation (8). Models fitted with the MIXED procedure of SAS. AIC, Akaike information criterion. a Covariance model σ 2 s exp − d=θ ð Þ; equivalent to power model (σ 2 s ρ d ) with ρ = exp(−1/θ). b The spatial coordinates LAT and LONG had to be multiplied by 10 4 to achieve convergence. c The spatial coordinates LAT and LONG had to be multiplied by 10 2 to achieve convergence.  Note: Note that σ 2 e = c 0 σ 2 and σ 2 s = 1− c 0 ð Þσ 2 (see Box 2). Results are for the Gaussian model of spatial covariance, assuming that errors are independent between plots. Note that "geoR" reports the nugget variance σ 2 e as "nugget" and "nlme" reports the nugget effect c 0 as "nugget". a This is the label printed with the Gaussian modelit has a different name if a different covariance structure is selected. b The "nlme" package reports this as a standard deviation, which is the square root of the variance. c "geoR" and "geoRglm" cannot fit random intercepts and they cannot model errors to be independent between plots (subjects), so no results can be reported here for these R packages. d In SAS, q = 4 is the number of variance parameters. In R package "nlme", the parameter count includes the fixed effects: q = 19.

T A B L E 1 2
Parameter estimates for final model (10) using the MIXED procedure of SAS a and the lme() function of the "nlme" package in R parameterization is used. This can be invoked using options (contrasts = c(unordered = "contr.SAS", ordered = "contr.poly")).
b Factors described in Table 8.
T A B L E 1 1 Sequential (type I) Wald-type F-tests for yield (t/ha) in Example 2 using model (9) with linear and quadratic regression terms for Ackerzahl (AZ) (see Table 12) assuming a Gaussian model with nugget variance for the residual error and a random plot effect Note: Tests were obtained with the MIXED procedure of SAS c . df, degrees of freedom. a Factors described in Table 8. b Approximated using the second-order method of Kenward and Roger (2009).
c Several of the F-tests obtained with the "nlme" package (see R companion) for this example differ from those found with SAS because of differences in the computational method, which are relevant for models with spatial or temporal covariance, as explained in Appendix 3 of Piepho and Edmondson (2018).
Even when only a single observation is modelled per experimental unit and the independence assumption is justified based on randomization (Webster & Lark, 2019), it may still be beneficial to fit a spatial model, as this may improve efficiency (Gilmour et al., 1997). The potential for improved efficiency in comparative experiments stems from the fact that inference is based on pairwise differences among observations and that the variance of a pairwise difference among two random variables Y 1 and Y 2 decreases linearly with the spatial covariance (i.e., var (Y 1 − Y 2 ) = var(Y 1 ) + var(Y 2 ) − 2cov(Y 1 , Y 2 ); also see Equation (1)). If such spatial modelling turns out to not provide an improved fit, or is deemed to be too complex, the randomization-based model can always be used as a fallback option for analysis, providing an unbiased estimation of treatment effects. Thus, in designed experiments the randomization principle should be adhered to whenever possible, even when use of a geostatistical covariance model is envisioned for analysis .
When contemplating the use of spatial models for analysis of a designed experiment, it should be kept in mind that switching from a design-(randomization-) to a model-based analysis will change results. Although the former entails assumptions, these are relatively straightforward and well known (Webster & Lark, 2019). The use of a spatial model imports further assumptionsstationarity of the spatial correlation, in particular. Thus, estimates of treatment means may differ slightly between both analyses, especially when spatial correlation is present over a distance similar to or longer than the length of an experimental field. Furthermore, as discussed by de Gruijter et al. (2006), the standard errors of means behave differently. This is exemplified by the different values for the s.e.d. in Table 5 (Example 1). Furthermore, with spatial analysis there is difference between estimates of the "model mean" (e.g., μ + τ i in Equation (5)) (this is equivalent to best linear unbiased estimation) and the "spatial mean", that is, the prediction of realizations of a random variable across a defined area (de Gruijter et al., 2006) (this is equivalent to BLUP). Here, we have solely considered the former, which is relevant for assessing treatment differences in a designed experiment.
This paper has focused on randomized designed experiments, but it should be emphasized that surveys and observational studies have a similar theme, with random sampling replacing the role of randomization. As with randomization in designed experiments, random sampling justifies the independence assumption in analysis (de Gruijter et al., 2006;de Gruijter & ter Braak, 1990). When sampling is not random but purposive or systematic, spatial modelling, integrated within a linear mixed modelling framework, may be used for a valid analysis (Lark & Cullis, 2004;Wiesmeier et al., 2012Wiesmeier et al., , 2013. It should be emphasized, however, that as with randomized experiments, surveys should be designed using the principle of random sampling whenever feasible, because this allows a design-based analysis, which has the virtue of simplicity and robustness. Also, adherence to the random sampling principles minimizes the risk of biased estimates of effects of interest. As with randomized experiments, model-based spatial analysis can always be tried, and this has the potential to improve precision over the purely designbased analysis, but should a suitable spatial model be difficult to identify, one can always use the designbased analysis. Time series data (longitudinal data, repeated measurements) have commonalities with spatial data in that correlation among observations made at different timepoints needs to be modelled. In fact, the models used for time series data bear much resemblance to models used for spatial data, and spatial covariance models are well suited to model temporal correlation, especially when time-points are not equally spaced (Pinheiro & Bates, 2000). Moreover, spatial and temporal covariance can be combined into a spatio-temporal model as shown in Example 1 ( Figure 5). Here, we have only used the anisotropic power model for spatio-temporal mixed modelling. More sophisticated mixed models of this type are considered, for example, in Schabenberger & F I G U R E 7 Plot of model (9) fitted to data of Example 2, assuming a Gaussian model with nugget and a random effect for plots. Factor levels for treatments are described in Table 7 [Color figure can be viewed at wileyonlinelibrary.com] Gotway (2004, Chapter 9) and de Faveri, Verbyla, Cullis, Pitchford, and Thompson (2017).
In Example 2 we have considered data obtained from yield mapping sensors based on flow-rate measurements in a combine harvester (New Holland, header width of 6 m). The analysis was performed making the simplifying assumption that monitored yields are actual yields. In reality, the monitored yield data represent a convolution of yields from different points simultaneously arriving at the recording device. Different methods may be applied to deal with this problem, such as specialized deconvolution methods (Lark & Wheeler, 2003) or preprocessing of the raw data by smoothing methods (Marchant et al., 2019). Wolfinger, R. D. (1996)