### Summary

- Top of page
- Summary
- 1. Introduction
- 2. Lattice of figures and area prediction
- 3. Lattice of points
- 4. Lattice of figures
- 5. Simulations
- 6. Comparisons of sampling schemes
- 7. Discussion
- References
- Appendix

Total planar area can be estimated based on sampling by a lattice of figures (e.g. point patterns, line segments, quadrats). General formulae are provided for the approximation of mean squared errors. The approximation formulae are products of the boundary length and of a parameter that depends only on the sampling scheme. An R package is provided by the authors for the numerical computation of the mean squared error formulae. The speed of convergence of the mean squared error approximation is assessed on the basis of several simulations. Several sampling schemes are compared in view of the approximated mean squared errors.

### 1. Introduction

The total area of a planar structure can be estimated from partial observations using standard tools of sampling theory. Common sampling probes are finite sets of points, lines, quadrats, etc. Measurements to be performed are counts, length or area measurements. In most cases, the sampling probes or units are systematically distributed on the plane. The whole sampling device is a lattice of figures. In microscopy, a figure corresponds to a sampling probe as seen in a single field of vision. The whole lattice of figures is obtained by systematic displacements of the field of vision.

Under the standard assumption that the lattice of figures is uniformly randomly translated, the total area estimator is unbiased. However, the precision of the area estimator depends both on the sampling scheme and on the spatial distribution of the structure of interest.

An approximation formula has been proposed by Kendall (1948) for the mean squared error (MSE) of the area estimator based on sampling by a lattice of points. Kendall's formula converges when the lattice density tends to infinity and it depends only on the curvature along the structure boundary. Kendall's formula has been refined by Matheron (1971): the MSE approximation can be decomposed into the so-called extension term and the oscillating term. Furthermore, under an isotropy assumption, the extension term depends only on the boundary length. For further discussion of the Kendall–Matheron formula, see, for example, Gundersen & Jensen (1987) and Matérn (1989). Recently, it has been proved that under some regularity conditions, the oscillating term of the MSE is of higher order than the extension term if the structure is random (Kiêu & Mora, 2004). Note that the randomness condition makes sense in most biological applications.

In this article, the Kendall–Matheron formula for sampling based on lattices of points is extended to other lattices of figures. This extension is obtained by using grading and regularization as defined by Matheron (1971). Unbiased area estimation based on sampling by lattices of figures is introduced in section 2. The Kendall–Matheron formula for point lattice sampling is given in section 3. The new formulae for general lattices of figures are provided in section 4. The performance of the MSE approximations is discussed in view of some simulations in section 5. Section 6 is devoted to the comparison of various sampling schemes.

### 2. Lattice of figures and area prediction

The structure of interest is a random compact set **X** in the two-dimensional plane ℝ^{2}. The parameter to be approximated is the area of **X**. Since here the area of **X** is supposed to be random, below we refer to area prediction instead of area estimation.

In order to predict the area **A** of **X**, the random compact set **X** is sampled by means of a lattice of figures (planar subsets). Examples of lattices of figures are provided in Fig. 1. In a lattice of figures, the figures differ only by translations, and the set of such translations is a vector lattice. Any lattice of figures can be represented as

$$\Lambda + F_2 = \{\lambda + x : \lambda \in \Lambda,\ x \in F_2\},$$

where Λ is a vector lattice and *F*_{2} is a planar subset. In the examples of Fig. 1(a–d), the vector lattice is two-dimensional and the figure *F*_{2} is compact. For the lattice of lines and the lattice of strips shown in Fig. 1(e–f), the vector lattice is one-dimensional and the figure *F*_{2} may be decomposed as

$$F_2 = L + F_1,$$

where *L* is the line orthogonal to Λ, and *F*_{1} is a compact subset of the line supporting Λ. When *F*_{1} is a single point, *F*_{2} = *L* + *F*_{1} is the line parallel to *L* through *F*_{1}. When *F*_{1} is a segment, *F*_{2} = *L* + *F*_{1} is a strip.

The density of the lattice Λ is defined as the inverse of the area (or length) |Λ| of a fundamental tile of Λ. A unit lattice is a lattice with density equal to 1. The approximation formulae provided below involve dual vector lattices. Two vector lattices are dual if each one consists of all the vectors whose scalar products with every vector of the other are integers. For more details about lattice theory, see Conway & Sloane (1999). The unit square lattice is self-dual. The dual of a hexagonal lattice is also hexagonal. The dual of a lattice with a rectangular tile of side lengths *l*_{1} and *l*_{2} is the lattice with a rectangular tile of side lengths 1/*l*_{1} and 1/*l*_{2}. More generally, dual lattices have inverse densities. The dual of a lattice Λ is denoted Λ*.
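The duality relation can be checked numerically for two-dimensional lattices. The sketch below is only an illustration: the helper names `dual_basis` and `tile_area` are hypothetical, and a lattice is assumed to be given by two basis vectors. It builds the basis of the dual lattice and confirms that the tile areas, and hence the densities, are inverse.

```python
def dual_basis(b1, b2):
    """Return a basis (d1, d2) of the dual of the lattice spanned by b1 and b2.

    The dual basis satisfies b_i . d_j = 1 if i == j and 0 otherwise, so every
    scalar product between a lattice vector and a dual vector is an integer.
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    d1 = (b2[1] / det, -b2[0] / det)
    d2 = (-b1[1] / det, b1[0] / det)
    return d1, d2


def tile_area(b1, b2):
    """Area of the fundamental tile spanned by b1 and b2."""
    return abs(b1[0] * b2[1] - b1[1] * b2[0])


# Rectangular tile with side lengths l1 = 2 and l2 = 0.25.
b1, b2 = (2.0, 0.0), (0.0, 0.25)
d1, d2 = dual_basis(b1, b2)
print(d1, d2, tile_area(b1, b2) * tile_area(d1, d2))
```

For the rectangular tile above, the dual basis vectors have lengths 1/2 and 4, in agreement with the rule stated for rectangular lattices.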

Consider a randomly translated lattice of figures

**Λ** + *F*_{2} = Λ + **U** + *F*_{2}

where **U** is a random translation vector uniformly distributed in a fundamental tile of Λ. The random lattice of figures samples the plane uniformly, and the sampling density of **Λ** + *F*_{2} is given by

$$\frac{|F_i|}{|\Lambda|},$$

where |*F*_{i}| denotes the ‘content’ of *F*_{i} (number of points, length or area, depending on the dimension of *F*_{i}).

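The area predictor, obtained by dividing the measured content of the sample **X** ∩ (**Λ** + *F*_{2}) by the sampling density,

$$\hat{\mathbf{A}} = \frac{|\Lambda|}{|F_i|}\,\bigl|\mathbf{X} \cap (\boldsymbol{\Lambda} + F_2)\bigr| \qquad (1)$$

is unbiased (conditionally on **X**). The precision of **Â** can be quantified through the MSE defined as

$$\mathrm{MSE}[\hat{\mathbf{A}}] = \mathrm{E}\bigl[(\hat{\mathbf{A}} - \mathbf{A})^{2}\bigr]. \qquad (2)$$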
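As an illustration of unbiasedness and of the MSE, the following sketch applies point counting on a square lattice to a unit disc. Several choices are assumptions made for the example only: the disc as the structure, the square lattice, and the replacement of the uniform random translation **U** by a regular grid of offsets in the fundamental tile. For a lattice of points, the predictor reduces to the tile area times the number of lattice points falling in the structure.

```python
import math


def predict_area(indicator, spacing, offset, half_width):
    """Point-count predictor: tile area times the number of lattice points in X.

    The square lattice has the given spacing and is translated by `offset`.
    Only points near [-half_width, half_width]^2 are visited, so X must lie
    inside that box.
    """
    n = int(math.ceil(half_width / spacing)) + 1
    count = 0
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if indicator(i * spacing + offset[0], j * spacing + offset[1]):
                count += 1
    return spacing ** 2 * count


def unit_disc(x, y):
    return x * x + y * y <= 1.0


spacing, k = 0.5, 20
true_area = math.pi
# Offsets on a regular k x k grid of the fundamental tile: a deterministic
# stand-in for drawing the translation U uniformly at random.
predictions = [
    predict_area(unit_disc, spacing,
                 ((a + 0.5) * spacing / k, (b + 0.5) * spacing / k), 1.0)
    for a in range(k) for b in range(k)
]
mean_prediction = sum(predictions) / len(predictions)
mse = sum((p - true_area) ** 2 for p in predictions) / len(predictions)
print(mean_prediction, mse)
```

Individual predictions are multiples of the tile area and scatter around the true area, while their average over the offsets is close to it.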

### 3. Lattice of points

Consider the special case where *F*_{2} consists of a single point, i.e. **Λ** + *F*_{2} is a simple lattice of points. Following Kendall (1948), the MSE can be expressed as

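$$\mathrm{MSE}[\hat{\mathbf{A}}] = {\sum_{y \in \Lambda^{*}}}' \mathrm{PSD}_{\mathbf{X}}(y), \qquad (3)$$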

where the prime symbol to the right of the summation denotes that the origin is excluded from the summation, and the power spectral density PSD_{X} of **X** is defined as the Fourier transform of the geometric covariogram of **X**. Following Matheron (1971), the geometric covariogram is the function

$$h \mapsto \mathrm{E}\bigl[|\mathbf{X} \cap (\mathbf{X} + h)|\bigr], \quad h \in \mathbb{R}^{2}. \qquad (4)$$

It can also be expressed as the mean convolution product of the indicator function of **X** and its reflexion with respect to the origin.

Since the densities of Λ and its dual Λ* are inverse, the asymptotic behaviour of the MSE when the sampling spacing tends to 0 depends only on the behaviour of the spectral density far from the origin. Using an asymptotic approximation of the spectral density and assuming that the boundary of **X** is isotropically distributed, the Kendall–Matheron formula states that

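$$\mathrm{MSE}[\hat{\mathbf{A}}] \simeq \frac{B\,|\Lambda|^{3/2}}{4\pi^{3}}\, Z_{\tilde{\Lambda}^{*}}(3), \qquad (5)$$

where *B* is the mean boundary length of **X**, $\tilde{\Lambda}^{*}$ is a unit version of Λ*, and *Z* denotes the Epstein zeta function.

The Epstein zeta function is a multidimensional extension of the Riemann zeta function defined by

$$Z_{\Lambda}(s, h) = {\sum_{\lambda \in \Lambda}}' \frac{\exp(2\pi i\, \lambda \cdot h)}{\|\lambda\|^{s}}. \qquad (6)$$

When the phase *h* = 0, the notation *Z*_{Λ}(*s*) = *Z*_{Λ}(*s*, 0) is used. Note that in Eq. (5), $Z_{\tilde{\Lambda}^{*}}(3)$ is scale-invariant: it depends only on the lattice shape.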

The isotropy assumption is quite restrictive. However, Eq. (5) holds even in the anisotropic case, provided that the sampling grid is isotropically randomly rotated. For a square lattice, Eq. (5) yields

$$\mathrm{MSE}[\hat{\mathbf{A}}] \simeq 0.0728\, B\, |\Lambda|^{3/2}.$$

Note that in Matheron (1971) and Gundersen & Jensen (1987), the multiplicative constant is given as 0.0724. This is because Matheron used a numerical approximation based on the Chowla–Selberg expansion of the Epstein zeta function (Chowla & Selberg, 1949). The multiplicative constant provided in Matérn (1985) is the same as in Eq. (5). Values of the Epstein zeta function must be computed numerically. Direct computations based on Eq. (6) are not efficient, because the summand converges very slowly towards 0. Alternatives are algorithms based on the Chowla–Selberg expansion or on the incomplete gamma function expansion; see Crandall (1998) for the latter approach.
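The slow convergence of the direct summation can be seen on the square lattice. The sketch below assumes that the multiplicative constant is the lattice sum of ‖λ‖^{−3} divided by 4π³, a convention that reproduces the square-lattice value 0.072837 listed in Table 2. It is not one of the algorithms cited above: the truncated sum is simply completed by an integral over the region outside the summation square, and the function names are hypothetical.

```python
import math


def square_zeta3_truncated(n_max):
    """Direct sum of ||(m, n)||^-3 over nonzero integer pairs with |m|, |n| <= n_max."""
    total = 0.0
    for m in range(-n_max, n_max + 1):
        for n in range(-n_max, n_max + 1):
            if m or n:
                total += (m * m + n * n) ** -1.5
    return total


def square_tail(n_max):
    """Integral of ||y||^-3 outside the square of half side n_max + 1/2.

    In polar coordinates the integral equals 8 * sin(pi / 4) / half_side.
    """
    return 8.0 * math.sin(math.pi / 4.0) / (n_max + 0.5)


scale = 4.0 * math.pi ** 3
for n_max in (10, 50, 100):
    raw = square_zeta3_truncated(n_max)
    corrected = raw + square_tail(n_max)
    print(n_max, raw / scale, corrected / scale)
```

The uncorrected constant still differs from the corrected one in the fourth decimal place after tens of thousands of terms, which is why expansions such as those cited above are preferred in practice.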

The convergence rate of the spectral density of **X** is related to the smoothness of its geometric covariogram near the origin. Derivations of MSE formulae as given in Gundersen & Jensen (1987) and in Cruz-Orive (1989) are based on local models for the covariogram near the origin. For the general case where *F*_{2} does not reduce to a single point, the measurements involved in the area predictor (1) can be considered as convolution products, which are simpler to handle in the Fourier space. This simplification is used in the next section, where MSE formulae are extended to general lattices of figures.

### 5. Simulations

The MSE approximations provided in section 4.1 converge when the fundamental tile of the sampling lattice shrinks to a single point. At the moment, there is no theoretical result concerning the speed of convergence of the approximations. In this section, this problem is investigated using simulations. We consider three different random sets. The first random set **X**_{1} has a simple shape. The two others, **X**_{2} and **X**_{3}, have more complex shapes, due to small-scale spatial variation. **X**_{3} is more regular than **X**_{2} because of stronger spatial autocorrelations. Two sampling schemes are considered: sampling by hexagonal lattices of points and sampling by hexagonal lattices of point patterns.

The random sets are obtained by thresholding nonstationary random fields. The geometry of a random set is controlled both by the mean and by the covariance function of the random field. The mean µ is taken as a sum of three Gaussian densities (Fig. 2a). The random set **X**_{1} is obtained by thresholding at a level *t* > 0 the nonstationary random field

$$\mu + \mathbf{Z}_{1},$$

where **Z**_{1} is a centred Gaussian random field with a Gneiting covariance function (Fig. 2b,d).

The random set **X**_{2} is obtained by thresholding at level *t* a random field of the following type:

$$\mu + \min(\mathbf{Z}_{1}, t) + \mathbf{Z}_{2}, \qquad (19)$$

where **Z**_{2} is a centred Gaussian random field with a Gneiting covariance function. The covariance parameters are chosen such that the spatial variation of **Z**_{2} is small-scale compared to **Z**_{1} (Fig. 2c,e). A similar construction is used for the random set **X**_{3}, except that the covariance of the random field with small-scale variation is a Bessel function. The parameters of the Bessel function are chosen such that the mean areas and the mean perimeters of **X**_{2} and **X**_{3} are close. However, **X**_{3} tends to be much more regular than **X**_{2}, due to stronger short-range autocorrelations (Fig. 2e,f).

Note that the simulated random sets are not isotropic, since the deterministic function µ is anisotropic. Statistics on the area and the perimeter of **X**_{1}, **X**_{2} and **X**_{3} computed from 3000 replications are provided in Table 1.

Table 1. Statistics for the area and the perimeter of **X**_{1}, **X**_{2} and **X**_{3}.

| Random set | Area (mean) | Area (variance) | Perimeter (mean) | Perimeter (variance) |
|---|---|---|---|---|
| **X**_{1} | 17.95 | 5.85 | 18.02 | 2.72 |
| **X**_{2} | 8.92 | 2.18 | 53.67 | 51.45 |
| **X**_{3} | 9.07 | 1.63 | 52.60 | 42.94 |

The simulations of the random fields have been carried out using the R package RandomFields (Schlather, 2001).

Two types of sampling scheme are applied to the random sets: sampling by lattices of points and sampling by lattices of point patterns. The lattice Λ is a hexagonal lattice. The pattern contains five points (see Fig. 4a), and it spans a square of side length 0.2. The spacing between the figures (point or pattern of five points) varies from 0.90 to 0.23. For **X**_{1}, when sampled by a lattice of points, the mean total point count increases from 20 to 300.

For each random set **X**_{j} and each sampling scheme, the true MSE has been estimated from 3000 independent realizations of (**Â**, **A**). The obtained empirical MSEs are compared to the asymptotic approximations in Fig. 3. Note that the estimated MSEs for a given **X**_{j} are not independent, since they are based on the same set of 3000 realizations of **X**_{j}.

In view of the results shown in Fig. 3, the MSE approximation performs similarly for the two types of lattices of figures.

The MSE approximation yields fairly good results for the simple shape random set **X**_{1}, even for rather small sample sizes. The worst relative error is about 7%. For **X**_{2}, the MSE approximation does not perform as well as for **X**_{1}: the worst relative error is about 25%. However, when the sampling spacing is less than 0.45, the relative error is less than 6%. The true MSE curve for the random set **X**_{3} is not monotone: there is a peak for small sampling densities. This peak is not captured by the asymptotic approximation. However, for larger sampling densities, the MSE approximation turns out to be rather close to the empirically estimated MSE. As for **X**_{2}, when the sampling spacing is less than 0.45, the relative error is less than 6%.

### 6. Comparisons of sampling schemes

As noted in section 4.3, sampling scheme performances can be compared independently of the random set **X**. Numerical comparisons are obtained from the MSE approximation formulae. Without loss of generality, the mean boundary length *B* is set to 1 and we focus only on unit lattices Λ.

Below, we compare different types of lattices of point patterns. We first consider three particular point patterns, as illustrated in Fig. 4. Each point pattern is rescaled in order to fill a square window of side length 0.1, 0.3 or 0.5.

The comparison is made for four types of two-dimensional unit lattices: hexagonal, square, rectangular 1 (2 × 0.5) and rectangular 2 (4 × 0.25). All values of the approximate MSE are given in Table 2.

Table 2. Approximate values of MSE for four unit point lattices, three point patterns and three window side lengths.

| Window side length | Number of points | Hexagonal | Square | Rectangular 1 | Rectangular 2 |
|---|---|---|---|---|---|
| – | 1 | 0.071701 | 0.072837 | 0.181599 | 1.253845 |
| 0.1 | 5 | 0.049602 | 0.050722 | 0.158423 | 1.179515 |
| 0.1 | 8 | 0.050988 | 0.052112 | 0.160432 | 1.174895 |
| 0.1 | 9 | 0.050648 | 0.051772 | 0.159657 | 1.185250 |
| 0.3 | 5 | 0.020014 | 0.020822 | 0.118961 | 1.100436 |
| 0.3 | 8 | 0.021602 | 0.022525 | 0.126178 | 1.114878 |
| 0.3 | 9 | 0.020769 | 0.021715 | 0.121904 | 1.115659 |
| 0.5 | 5 | 0.008077 | 0.007887 | 0.087608 | 0.983413 |
| 0.5 | 8 | 0.007113 | 0.007263 | 0.097979 | 1.024359 |
| 0.5 | 9 | 0.005952 | 0.006282 | 0.089554 | 1.014918 |

The first line of Table 2 shows the approximate MSE values for the four simple point unit lattices. These values are given in Matérn (1989), where these four systematic sampling schemes are compared. The minimum value is obtained for the hexagonal lattice. This illustrates the general optimality result given in Rankin (1953), in which the author demonstrates that for all *s* ≥ 1.035, the hexagonal lattice minimizes the Epstein zeta function *Z*_{Λ}(*s*). Hence, the hexagonal lattice is optimal among all point lattices with a given density.
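The first line of Table 2 can be reproduced with a short numerical sketch. It assumes that, for *B* = 1 and a unit lattice, the approximate MSE is the sum of ‖*y*‖^{−3} over the nonzero points of the lattice divided by 4π³. Since the dual of each unit lattice considered here has the same shape as the lattice itself, the sum is taken directly over the lattice. The function name `zeta3`, the cutoff radius and the tail correction are choices made for this illustration only.

```python
import math


def zeta3(b1, b2, radius=60.0):
    """Approximate sum of ||y||^-3 over the nonzero points y of the lattice (b1, b2).

    Points with ||y|| <= radius are summed directly. The remainder is replaced
    by the integral of r^-3 outside the disc, divided by the tile area.
    """
    det = abs(b1[0] * b2[1] - b1[1] * b2[0])
    m_max = int(radius * math.hypot(*b2) / det) + 1
    n_max = int(radius * math.hypot(*b1) / det) + 1
    total = 0.0
    for m in range(-m_max, m_max + 1):
        for n in range(-n_max, n_max + 1):
            if m == 0 and n == 0:
                continue
            r = math.hypot(m * b1[0] + n * b2[0], m * b1[1] + n * b2[1])
            if r <= radius:
                total += r ** -3
    return total + 2.0 * math.pi / (radius * det)


a = math.sqrt(2.0 / math.sqrt(3.0))  # side of the unit hexagonal lattice
UNIT_LATTICES = {
    "hexagonal": ((a, 0.0), (a / 2.0, a * math.sqrt(3.0) / 2.0)),
    "square": ((1.0, 0.0), (0.0, 1.0)),
    "rectangular 1": ((2.0, 0.0), (0.0, 0.5)),
    "rectangular 2": ((4.0, 0.0), (0.0, 0.25)),
}
constants = {name: zeta3(*basis) / (4.0 * math.pi ** 3)
             for name, basis in UNIT_LATTICES.items()}
for name, value in constants.items():
    print(name, round(value, 6))
```

Under these assumptions the hexagonal lattice gives the smallest constant of the four, in line with the optimality result quoted above.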

Also note that for window sizes 0.1 and 0.3, the MSEs are quite close for patterns of 5, 8 and 9 points. For such small figure sizes, increasing the number of points per pattern is not an efficient way to improve the precision of the area predictor.

For the different point patterns considered, the approximate MSE values obtained for the hexagonal and square lattice are quite similar. Compared to these two lattices, the performance of the rectangular lattices is quite poor.

Next, for a given lattice of points Λ, a square window *W* of given side length *l* and a given number of points *n,* an optimal pattern of *n* points included in *W* can be computed by optimizing the MSE approximation. The results shown in Fig. 5 are obtained using the L-BFGS-B optimization method, which allows box constraints (Byrd *et al*., 1995). Since this method uses a quasi-Newton algorithm, it yields only a local minimum that depends on the provided initial pattern. For each optimal pattern search, 20 L-BFGS-B optimizations were carried out, starting from randomly chosen initial patterns.

When the side length of the window is small compared to the tile of the lattice, the optimization procedure gives optimal patterns with points located at the corners of the window *W*. For *n >* 4, this yields several points at the same corner. This is the case for *l* = 0.1, as seen in Fig. 5. To give a more complete picture of the case *l* = 0.1, the optimal patterns and associated approximate MSE values obtained for *n* = 2, ... ,5 are shown in Fig. 6.

From these results, one can see that for *n* varying from 1 to 7, the lowest approximate MSE is obtained for *n* = 4. Adding a fifth point increases the MSE value.
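The effect of a duplicated corner point can be explored with a rough sketch. It relies on assumptions that are not taken verbatim from the text: a unit square lattice, *B* = 1, and an approximate MSE equal to the sum over nonzero integer vectors *y* of |Σ_j exp(2πi *y*·*x*_j)|² / (*n*² ‖*y*‖³), divided by 4π³, where the *x*_j are the *n* points of the pattern. The sum is truncated at a finite radius without tail correction, so the values are somewhat too small and only their ordering is meaningful. No optimization is performed here.

```python
import cmath
import math


def pattern_mse(points, radius=40.0):
    """Truncated approximate MSE for a unit square lattice of point patterns (B = 1)."""
    n_pts = len(points)
    k = int(radius)
    total = 0.0
    for m in range(-k, k + 1):
        for n in range(-k, k + 1):
            r = math.hypot(m, n)
            if r == 0.0 or r > radius:
                continue
            # Fourier transform of the point pattern at the lattice vector (m, n).
            s = sum(cmath.exp(2j * math.pi * (m * x + n * y)) for x, y in points)
            total += abs(s) ** 2 / (n_pts ** 2 * r ** 3)
    return total / (4.0 * math.pi ** 3)


def corners(side):
    """Four corners of a square window of the given side length, centred at the origin."""
    h = side / 2.0
    return [(-h, -h), (-h, h), (h, -h), (h, h)]


single = pattern_mse([(0.0, 0.0)])
four = pattern_mse(corners(0.1))
five = pattern_mse(corners(0.1) + [(0.05, 0.05)])  # one corner used twice
print(single, four, five)
```

With this formula, the four-corner pattern gives a smaller value than a single point, and duplicating one corner raises the value again, which is consistent with the behaviour described for *l* = 0.1.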

Sometimes, lower-dimensional probes are more efficient than higher-dimensional probes. Jensen & Gundersen (1981) considered the example of a disc *X* sampled by an isotropic uniform random (IUR) unit square and by the pattern consisting of the four corners of the IUR square. It turns out that for sufficiently large values of the disc radius, the point pattern probe performs better than the square probe. This situation, where point counting is more efficient than area measurement, has been investigated and exemplified in Baddeley & Cruz-Orive (1995) for stationary random sets. The authors related this paradox to Smit's paradox appearing in the theory of random fields.

Below, we compare sampling by lattices of quadrats and sampling by lattices of square four-point patterns. The common side length of the quadrats and of the four-point patterns varies from 0.1 to 0.9. The resulting MSE values are plotted in Fig. 7. These results give a new illustration of Smit's paradox: point counting performs better than area measurement for sufficiently small windows.

Finally, let us consider sampling schemes based on segments. The approximation in Eq. (7) can be used in order to compute the optimal segment orientation. As expected, the optimal orientation is given by the diagonal of the sampling point lattice tile. The worst orientation corresponds to a segment parallel to one of the sides of the lattice tile. Table 3 shows the maximal and minimal approximate MSE values for various segment lengths, for both the square and the hexagonal unit lattices.

Table 3. Approximate MSE values for the best and worst segment orientations. The segment length *l* varies from 0.1 to 0.9.

| Lattice shape | Orientation (degrees) | *l* = 0.1 | *l* = 0.3 | *l* = 0.5 | *l* = 0.7 | *l* = 0.9 |
|---|---|---|---|---|---|---|
| Square | 0 | 0.062744 | 0.045675 | 0.032810 | 0.024213 | 0.019919 |
| Square | 45 | 0.062743 | 0.045600 | 0.032250 | 0.022193 | 0.014877 |
| Hexagonal | 0 | 0.061611 | 0.044528 | 0.031491 | 0.022373 | 0.017095 |
| Hexagonal | 30 | 0.061611 | 0.044525 | 0.031435 | 0.021981 | 0.015486 |
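The square-lattice rows of Table 3 can be approximated with a short sketch. It assumes *B* = 1, a unit square lattice, and an approximate MSE equal to the sum over nonzero integer vectors *y* of sinc²(π *l* *y*·*u*) / ‖*y*‖³, divided by 4π³, where *u* is the unit vector along the segment and sinc(*t*) = sin(*t*)/*t*. The sinc² factor stands for the normalized spectral density of a segment of length *l*. The function name and the cutoff radius are hypothetical choices, and the neglected tail makes the values slightly too small.

```python
import math


def segment_mse(length, angle_deg, radius=60.0):
    """Truncated approximate MSE for a unit square lattice of segments (B = 1)."""
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    k = int(radius)
    total = 0.0
    for m in range(-k, k + 1):
        for n in range(-k, k + 1):
            r = math.hypot(m, n)
            if r == 0.0 or r > radius:
                continue
            t = math.pi * length * (m * ux + n * uy)
            sinc = 1.0 if t == 0.0 else math.sin(t) / t
            total += sinc * sinc / r ** 3
    return total / (4.0 * math.pi ** 3)


for length in (0.1, 0.5, 0.9):
    print(length, segment_mse(length, 0.0), segment_mse(length, 45.0))
```

Under these assumptions the values for orientation 0 agree with the first row of Table 3 to about four decimal places, and the diagonal orientation gives the smaller value.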

### 7. Discussion

In this article, we provide approximation formulae for the MSE of several stereological area predictors. These approximations converge when the sampling density tends to infinity. Simulations show that the convergence is faster for special structures with a simple shape. The MSE approximations depend only on the boundary length. As shown in the example in section 5, the MSE approximation may perform poorly when the structure presents strong spatial autocorrelations. In such a case, one may try to derive the MSE from a continuous geometric covariogram model. However, it is still an open problem how to characterize the family of geometric covariograms (Matheron, 1993).

The general MSE approximation in Eq. (7) holds when the boundary of the spatial structure is isotropically orientated. The simulations show that the approximation performs well when the anisotropy is not too pronounced and the lattice of figures is hexagonal. In the case of strong anisotropy, the lattice of figures should be randomly orientated. Also note that Eq. (9) extends to the anisotropic case. The extended formula involves the rose of normal directions to the boundary; see Kiêu & Mora (2004) for the asymptotic approximation of the spectral density in the anisotropic case.

The MSE approximation formulae can be used to assess the precision of stereological area predictions, provided that the mean boundary length is available. In practice, both area and boundary length can be predicted by combining point and line segment sampling; see, for example, Gundersen & Jensen (1987).

Owing to their simple structure, the MSE formulae can be used in sampling design. In particular, sampling parameters such as the spacing between figures or the number of points per point pattern can be computed such that the sampling scheme achieves an expected coefficient of error.
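As a hedged illustration of such a design computation, the sketch below solves for the tile area of a point lattice that achieves a target coefficient of error. It assumes that the MSE scales as *c* *B* |Λ|^{3/2}, where *c* is the unit-lattice constant from the first line of Table 2. The exponent 3/2 follows from a scaling argument, given that the MSE is proportional to *B*. The function names are hypothetical, and the coefficient of error is taken as the square root of the MSE divided by the area.

```python
import math

C_HEXAGONAL = 0.071701  # unit hexagonal point lattice, first line of Table 2
C_SQUARE = 0.072837     # unit square point lattice, first line of Table 2


def tile_area_for_target_ce(area, boundary_length, target_ce, constant=C_HEXAGONAL):
    """Tile area |Lambda| for which sqrt(MSE) / area equals target_ce.

    Assumes MSE ~= constant * boundary_length * |Lambda|^(3/2).
    """
    mse_target = (target_ce * area) ** 2
    return (mse_target / (constant * boundary_length)) ** (2.0 / 3.0)


def predicted_ce(area, boundary_length, tile_area, constant=C_HEXAGONAL):
    """Coefficient of error predicted by the same approximation."""
    return math.sqrt(constant * boundary_length * tile_area ** 1.5) / area


# Mean area and mean perimeter of X1 from Table 1, target coefficient of error 5%.
tile = tile_area_for_target_ce(17.95, 18.02, 0.05)
print(tile, 17.95 / tile, predicted_ce(17.95, 18.02, tile))
```

The second printed value is the expected number of lattice points hitting the structure. Using mean values of the area and of the boundary length, as done here, gives only an order of magnitude for a single realization.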

Furthermore, it is possible to compare sampling schemes independently of the structure under investigation. As shown in section 6, when the window (field of vision) is small compared to the lattice tile, the pattern of the four corner points performs better than other probes. Using more points or measuring the area inside the window does not improve the precision of area prediction.

The extension to volume prediction may seem straightforward. The general convergence result for the spectral density of a random compact set given in Kiêu & Mora (2004) holds in spaces with arbitrary dimensions. A set of simple MSE formulae for volume predictors is available in Kiêu & Mora (2005). However, in practice, sampling of three-dimensional structures involves complex nested sampling schemes. In general, top-level stages involve physical sectioning, and physical slabs or blocks are sampled independently. Hence, the whole sampling scheme combines stratified and systematic sampling. MSE formulae for this type of sampling scheme are not yet available, but they can be derived from existing mathematical tools.

### Appendix

The approximation formulae in section 4.1 are based on a convergence result for the spectral density of **X**. This convergence result holds under the regularity conditions given in Kiêu & Mora (2004). When the normal directions to the boundary of **X** are isotropically distributed, the asymptotic behaviour of the spectral density is given by

$$\mathrm{PSD}_{\mathbf{X}}(y) \simeq \frac{1}{\pi}\cdot\frac{B}{4\pi^{2}\,\|y\|^{3}} \qquad (20)$$

when ‖*y*‖ tends to infinity. The result in Eq. (20) can be extended to the anisotropic case by replacing the factor 1/π on the right-hand side with *r*(ω), ω = ‖*y*‖^{−1}*y*, where *r* is the rose of unoriented normal directions to the boundary of **X** (*r* ≡ 1/π for isotropically distributed normals).

A similar result has been obtained in the case where **X** is a deterministic convex body (Kendall, 1948). However, in this case an oscillating term of the order of ‖*y*‖^{−3} must be added in the approximation of the spectral density.

Furthermore, the convergence of the spectral density is closely connected to a result due to Matheron (1971) concerning the behaviour of the geometric covariogram of **X** near the origin. For the sake of simplicity, we assume that the random compact set **X** is isotropic. Following Matheron (1971), p. 36, the geometric covariogram near the origin can be approximated by

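$$\mathrm{E}\bigl[|\mathbf{X} \cap (\mathbf{X} + h)|\bigr] \simeq \mathrm{E}[\mathbf{A}] - \frac{B}{\pi}\,\|h\|. \qquad (21)$$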

The approximation (21) is proved in Matheron (1975) for deterministic convex bodies. The extension to random bodies is straightforward. Note that in view of Formula (21), the covariogram is not differentiable at the origin. Now let us assume that the geometric covariogram is continuously differentiable outside the origin. Then it is easy to prove by standard Fourier calculus that the convergence result in Eq. (20) holds for the spectral density.
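The linear behaviour of the covariogram near the origin can be illustrated on a deterministic rectangle, for which the covariogram has a closed form. In this sketch, averaging over the direction of *h* plays the role of the isotropy assumption, and the mean slope at the origin is compared with the perimeter divided by π. The rectangle, the function names and the numerical parameters are assumptions made for the example.

```python
import math


def rectangle_covariogram(a, b, hx, hy):
    """Area of the intersection of an a x b rectangle with its translate by (hx, hy)."""
    return max(0.0, a - abs(hx)) * max(0.0, b - abs(hy))


def mean_slope_at_origin(a, b, eps=1e-4, n_dirs=3600):
    """Average over directions u of (K(0) - K(eps * u)) / eps."""
    total = 0.0
    for k in range(n_dirs):
        theta = 2.0 * math.pi * (k + 0.5) / n_dirs
        k_h = rectangle_covariogram(a, b, eps * math.cos(theta), eps * math.sin(theta))
        total += (a * b - k_h) / eps
    return total / n_dirs


a, b = 2.0, 1.0
perimeter = 2.0 * (a + b)
print(mean_slope_at_origin(a, b), perimeter / math.pi)
```

The two printed values nearly coincide, which matches a first-order decrease of the direction-averaged covariogram at rate *B*/π.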

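Let us consider the case of a compact figure *F*_{2}. By regularization, the spectral density of the measurement function

$$x \mapsto \frac{|\mathbf{X} \cap (F_2 + x)|}{|F_2|}$$

is equal to

$$\mathrm{PSD}_{\mathbf{X}}(y)\,\frac{\mathrm{PSD}_{F_2}(y)}{|F_2|^{2}}.$$

In the formula above, the ratio is scale invariant with respect to *F*_{2}, and the spectral density of *F*_{2} is equal to the Fourier transform of its geometric covariogram. Hence the prediction MSE can be written as

$$\mathrm{MSE}[\hat{\mathbf{A}}] = {\sum_{y \in \Lambda^{*}}}' \mathrm{PSD}_{\mathbf{X}}(y)\,\frac{\mathrm{PSD}_{F_2}(y)}{|F_2|^{2}}.$$

Finally, the convergence result in Eq. (20) yields the approximation formula

$$\mathrm{MSE}[\hat{\mathbf{A}}] \simeq \frac{B}{4\pi^{3}}\, {\sum_{y \in \Lambda^{*}}}' \frac{\mathrm{PSD}_{F_2}(y)}{|F_2|^{2}\,\|y\|^{3}}.$$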

Next consider the case of sampling by a lattice of lines. The spectral density of the graded measurement function

$$x \mapsto |\mathbf{X} \cap (L + x)|$$

is equal to the restriction of the spectral density PSD_{X} to *L*^{⊥}.

For a general figure *F*_{1}, the spectral density of the regularized and graded measurement function can be expressed as

$$\mathrm{PSD}_{\mathbf{X}}(y)\,\frac{\mathrm{PSD}_{F_1}(y)}{|F_1|^{2}}$$

with *y* ∈ *L*^{⊥}. MSE[**Â**] is equal to the sum of the spectral density of the measurement function over the nonzero vectors of the dual lattice Λ*. Using the convergence result in Eq. (20), we get the approximation formula in Eq. (9).