### Summary

- Top of page
- Summary
- 1. Introduction
- 2. Examples of systematic sampling in stereology
- 3. The transitive methods
- 4. Regularity conditions
- 5. The relevant variance approximation depends on the sampling density
- 6. Measurement error
- 7. Resampling from a large sample
- 8. Analysis of 20 data sets
- 9. Discussion of earlier methods of analysis
- 10. An example and some practical recommendations
- 11. Open problems and further research
- Acknowledgements
- References
- Appendix

In the present paper, we summarize and further develop recent research in the estimation of the variance of stereological estimators based on systematic sampling. In particular, it is emphasized that the relevant estimation procedure depends on the sampling density. The validity of the variance estimation is examined in a collection of data sets, obtained by systematic sampling. Practical recommendations are also provided in a separate section.

### 1. Introduction

Systematic sampling is widely used in stereology, partly because it is a convenient way of introducing replications in the sampling and partly because estimators based on systematic sampling are often much more efficient than estimators based on independent sampling. The transitive methods for estimating the variance of estimators based on systematic sampling were developed by Matheron (1965, 1971). Within the last decade, these methods have been introduced and applied in stereology, cf. Gundersen & Jensen (1987), Cruz-Orive (1989), Kellerer (1989), Matérn (1989) and Mattfeldt (1989).

In Cruz-Orive (1993), some further developments were presented. In particular, it was discussed how to treat the case where the measurement function is estimated with error. Also, examples were reported where the variance estimate suggested in Gundersen & Jensen (1987) was too high, and this finding was also supported by derivations for ‘quasi-ellipsoidal’ objects. Recently, the use of transitive methods in stereology has been reconsidered by a French group of statisticians, cf. Souchet (1995), Kiêu (1997) and Kiêu *et al.* (1999). Their main contribution has been to formulate a set of sufficient conditions under which the relevant variance approximations hold.

In the present paper, we will present and further develop the work by the French group and discuss the practical implications. In Section 2, we describe some of the most common examples of systematic sampling in stereology. The transitive methods are presented in Section 3 (some technicalities are deferred to the Appendix), and the regularity conditions in Section 4. As a new development, we show in Section 5 that the relevant variance approximation depends on the sampling density. In Section 6, we discuss how to include measurement error in the estimation procedure. As an alternative to the transitive methods, resampling from a large sample is discussed in Section 7. The rest of the main paper is devoted to practical investigations of the methods. The analysis of a variety of data sets is discussed in Section 8 and some of the earlier methods of analysis are reconsidered in Section 9. Practical recommendations are provided in Section 10. Open problems and ideas for future work are presented in Section 11.

### 2. Examples of systematic sampling in stereology

In the present paper, we consider mainly the situation where the parameter of interest can be expressed as a one-dimensional integral

$$Q = \int_{-\infty}^{\infty} f(x)\,\mathrm{d}x,$$

where *f* is a non-negative function defined on the real axis.

The general estimator of *Q* to be discussed is based on observations at a systematic set of points with spacing *T* between neighbouring points. The estimator takes the form

$$\hat{Q}_T = T \sum_{k=-\infty}^{\infty} f(U + kT),$$

where *U* is uniform random in an interval of the real axis of length *T*.

A well-known geometric example of this set-up is the case where *Q* is the volume of a bounded set *X* ⊆ *R*^{3} and *f*(*x*) the area of the intersection between *X* and a plane with a fixed orientation and position *x* ∈ *R*^{1}. Here, *Q*^_{T} becomes the famous Cavalieri estimator, which is named after the Italian mathematician Bonaventura Cavalieri, cf. Fig. 1. Another very important example is obtained by letting *Q* be the number of points in a finite set *X* ⊊ *R*^{3} and *f*(*x*) = *n*(*x*)/*h*, where *n*(*x*) is the number of points counted in a disector with position *x* ∈ *R*^{1} and height *h*. Then, *Q*^_{T} is the disector estimator of number based on systematic sampling. The counting principle was introduced as late as 1984 (Sterio, 1984). Another very useful design closely related to the disector design is the fractionator design, cf. Gundersen (1986).
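As an illustration of the Cavalieri set-up, the following minimal sketch applies the systematic estimator to the unit ball, whose section area at position *x* is π(1 − *x*²). The function names, the spacing *T* = 0.2 and the number of repetitions are assumptions made for this example only.

```python
import numpy as np

def ball_section_area(x, radius=1.0):
    """Area of the intersection of a ball centred at 0 with a plane at position x."""
    x = np.asarray(x, dtype=float)
    return np.pi * np.clip(radius**2 - x**2, 0.0, None)

def systematic_estimate(f, lo, hi, T, u):
    """T * sum_k f(lo + u + k*T), summed over the sampling points inside [lo, hi)."""
    n_max = int(np.ceil((hi - lo) / T)) + 1
    positions = lo + u + T * np.arange(n_max)
    positions = positions[positions < hi]
    return T * np.sum(f(positions))

rng = np.random.default_rng(0)
T = 0.2  # hypothetical spacing: about 10 sections through the unit ball
estimates = np.array([
    systematic_estimate(ball_section_area, -1.0, 1.0, T, rng.uniform(0.0, T))
    for _ in range(2000)
])
true_volume = 4.0 * np.pi / 3.0

# The mean of the estimates should be close to the true volume (unbiasedness),
# and the empirical coefficient of error should be small.
print(estimates.mean(), true_volume)
print(estimates.std() / true_volume)
```

Each call uses a fresh uniform start *U* in [0, *T*), so the spread of the estimates reflects the systematic sampling variance alone, without measurement error.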

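In some cases, the parameter of interest *Q* may instead be expressed as a circular integral

$$Q = \int_{0}^{2\pi} f(\theta)\,\mathrm{d}\theta,$$

where *f* is a non-negative function defined on [0, 2π). The estimator to be considered is then

$$\hat{Q}_n = \frac{2\pi}{n} \sum_{k=0}^{n-1} f\!\left(\Theta + \frac{2\pi k}{n}\right),$$

where Θ is uniform random in [0, 2π/*n*) and *n* is a positive integer.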

The 2D analogue to the Cavalieri estimator is used to estimate area from intersection length on a systematic set of parallel equidistant lines. In the planar case, however, we have an alternative. Thus, if *Q* is the area of a bounded planar region, this area may be expressed as a circular integral where *f*(θ) is equal to 1/2 times the so-called squared ray distance in direction θ, cf. Courant (1934, p. 275), see also Jensen (1998). The estimator of *Q* based on *n* systematic rays is the so-called nucleator, cf. Gundersen (1988). The 3D analogue may be derived from results in Courant (1936, p. 267).
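The circular design can be sketched in the same way. The example below estimates the area of an ellipse from *n* systematic rays through its centre, taking *f*(θ) to be half the squared ray distance. The ellipse, its semi-axes, the choice of four rays and all function names are hypothetical and serve only to illustrate the estimator.

```python
import numpy as np

def ellipse_ray_distance(theta, a, b):
    """Distance from the centre of an ellipse (semi-axes a, b) to its boundary in direction theta."""
    return a * b / np.sqrt((b * np.cos(theta))**2 + (a * np.sin(theta))**2)

def nucleator_area(ray_distance, n, theta0):
    """(2*pi/n) * sum_k f(theta0 + 2*pi*k/n) with f = squared ray distance / 2."""
    angles = theta0 + 2.0 * np.pi * np.arange(n) / n
    f = 0.5 * ray_distance(angles)**2
    return (2.0 * np.pi / n) * np.sum(f)

rng = np.random.default_rng(1)
n_rays = 4            # hypothetical number of systematic rays
a, b = 2.0, 1.0       # hypothetical semi-axes
estimates = np.array([
    nucleator_area(lambda t: ellipse_ray_distance(t, a, b), n_rays,
                   rng.uniform(0.0, 2.0 * np.pi / n_rays))
    for _ in range(4000)
])

# Averaged over the random start angle, the estimates should be close to pi*a*b.
print(estimates.mean(), np.pi * a * b)
```

For a disc the squared ray distance is constant, so the estimate is exact for every start angle; for the ellipse only the average over Θ recovers the area.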

### 3. The transitive methods

We shall in this section consider mainly the case where *Q* may be expressed as an integral along the real axis. The modifications for the circular case will be indicated when needed.

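It may be shown that the variance of *Q*^_{T} can be expressed in terms of the covariogram *g* of *f*,

$$g(h) = \int_{-\infty}^{\infty} f(x)\,f(x+h)\,\mathrm{d}x,$$

as

$$\operatorname{Var}(\hat{Q}_T) = T \sum_{k=-\infty}^{\infty} g(kT) - \int_{-\infty}^{\infty} g(h)\,\mathrm{d}h. \tag{1}$$

The variance of *Q*^_{T} may therefore be interpreted as the difference between the integral of *g* and a discrete approximation of this integral. A similar formula holds in the circular case, cf. the Appendix.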

As originally suggested by Matheron (1965, 1971), the difference (1) may be evaluated using the so-called Euler–MacLaurin formula. The resulting approximation is valid for *T* → 0 and is based on the behaviour of *g* near the origin. For instance, if *g* has an expansion near the origin as

cf. e.g. Cruz-Orive (1993, (2.4) and (2.5)). The actual calculation of the constants in (3) will be explained later in this section.

Souchet (1995), Kiêu (1997) and Kiêu *et al.* (1999) were the first to make explicit how the smoothness of *g* near the origin depends on the smoothness of the measurement function *f*. Under the regularity conditions stated in the next section, it may be shown that if *m* is the smallest integer such that *f* ^{(m)} may have jumps, then *g* is 2*m* times continuously differentiable, cf. Kiêu (1997).

Since the covariogram is a symmetric function,

*j* = 0, 1, …, 2*m*, which in turn implies that all odd derivatives of *g* of order up to 2*m* − 1 are zero at the origin, i.e.

Note that, in particular, this implies that if *f* is continuous (*m* ≥ 1), then the derivative of *g* at the origin is 0 and therefore *b*_{1} = 0 in (2) and (3). It had earlier been believed that a zero slope of *g* at the origin could be expected only in special cases, e.g. for measurement functions associated with volume estimation of quasi-ellipsoidal objects, cf. Cruz-Orive (1993).

The parameter *m* will from now on be used to denote the smoothness class of *f*. In the case where *f* indicates the area of intersection between *X* ⊊ *R*^{3} and a plane, the smoothness of *f* should not be confused with the smoothness of *X*.

Furthermore, if *f* ^{(m)} jumps at *a*_{1}, … , *a*_{p}, then, cf. Kiêu (1997),

where + and − indicate limits from the right and left, respectively.

Let us illustrate the relationship between the smoothness of *f* and *g* by a simple example. Let

The function *f* is the area of the intersection between the unit ball in *R*^{3} and a plane at distance |*x*| from *O*. Note that the smoothness class of *f* is 1 and that *f* is not differentiable at *p* = 2 points, viz. *a*_{1} = −1, *a*_{2} = 1. By elementary, but somewhat tedious, calculations one may directly derive the covariogram in this case:

Using (5) or direct differentiation, we find that

For a measurement function of smoothness class *m*, the variance of *Q^*_{T} can for small *T* be approximated as follows

where *B*_{2m+2} is the Bernoulli number of order 2*m* + 2, cf. Abramowitz & Stegun (1964), Kiêu (1997) and the Appendix. Note that *g*^{(2m+1)}(0^{+}) is determined by the ‘transitions’ (jumps) of *f* ^{(m)}, cf. (5). Hence the name ‘transitive methods’, cf. Matheron (1971, p. 9).

The order of magnitude of the variance is *T*^{2m+2}. If the length of the support of the function *f* is ℓ, then the mean number of measurements inside the support is *n* = *ℓ/T* such that the variance is of the order *n*^{−(2m+2)}. The approximation (6) also holds in the circular case where ℓ = 2π, cf. the Appendix.
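To make these orders of magnitude concrete, the leading term can be written out for the two smoothness classes used most in practice. The display is a sketch that assumes the approximation takes the standard Euler–MacLaurin form, with Bernoulli numbers *B*_{2} = 1/6 and *B*_{4} = −1/30; it is meant as a worked illustration and not as a restatement of the numbered formula.

$$
\begin{aligned}
\operatorname{Var}(\hat{Q}_T) &\approx -\frac{2B_{2m+2}}{(2m+2)!}\,T^{2m+2}\,g^{(2m+1)}(0^{+}),\\[4pt]
m = 0:\quad \operatorname{Var}(\hat{Q}_T) &\approx -\frac{T^{2}}{6}\,g'(0^{+}) = -\frac{\ell^{2}}{6\,n^{2}}\,g'(0^{+}),\\[4pt]
m = 1:\quad \operatorname{Var}(\hat{Q}_T) &\approx \frac{T^{4}}{360}\,g'''(0^{+}) = \frac{\ell^{4}}{360\,n^{4}}\,g'''(0^{+}).
\end{aligned}
$$

The second equality in each line uses *T* = ℓ/*n*, which gives the orders *n*^{−2} and *n*^{−4}, respectively.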

In Fig. 2, a collection of measurement functions is shown, together with a plot of the true variances and the approximate variances (6) as a function of *n*. The true variances have been determined by computer simulations. The results shown in Fig. 2 confirm the asymptotic theory.
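A numerical check of this kind can be sketched for a single measurement function. The example below uses the section area of the unit ball (smoothness class *m* = 1) and computes the variance by averaging over a fine grid of start positions *U* instead of Monte Carlo simulation. The function names, the grid size and the chosen values of *n* are assumptions for this illustration and are not taken from Fig. 2.

```python
import numpy as np

def ball_section_area(x):
    """Section area of the unit ball at position x (zero outside [-1, 1])."""
    x = np.asarray(x, dtype=float)
    return np.pi * np.clip(1.0 - x**2, 0.0, None)

def systematic_variance(f, lo, hi, n, n_offsets=2000):
    """Variance of T*sum_k f(U + k*T) for T = (hi - lo)/n, averaged over a grid of offsets U."""
    T = (hi - lo) / n
    offsets = (np.arange(n_offsets) + 0.5) * T / n_offsets   # midpoints of [0, T)
    k = np.arange(n + 1)
    positions = lo + offsets[:, None] + T * k[None, :]
    estimates = T * f(positions).sum(axis=1)
    return estimates.var()

ns = np.array([5, 10, 20, 40, 80])
variances = np.array([systematic_variance(ball_section_area, -1.0, 1.0, n) for n in ns])

# Slope of log(variance) against log(n); for smoothness class m = 1 it should be near -4.
slope = np.polyfit(np.log(ns), np.log(variances), 1)[0]
print(slope)
```

Replacing `ball_section_area` by a discontinuous function, such as the indicator of an interval, would be expected to give a slope closer to −2.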

Let us look in more detail at the cases *m* = 0, 1. For *m* = 0, it is well known that for small *T* we may, instead of (6), use the following approximation

and for small *T* we have the approximation

(Compare with (6), *B*_{4} = −1/30.) If *T* is small enough,

may be used as an approximation of *b*_{3} and we get

An unbiased estimator of the right-hand sides of (7) and (8) can be obtained by replacing *g*( *jT* ) by its estimate

In the circular case, the covariogram may also be estimated in this way, if a periodic continuation of *f* is used. Recall that in this case *T* = 2π/*n*.
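The estimation step can be sketched as follows. The code assumes that the empirical covariogram is *T* times the sum of lagged products of the observations, and that the small-*T* approximations for *m* = 0 and *m* = 1 are *T*(3*g*(0) − 4*g*(*T*) + *g*(2*T*)) divided by 12 and 240, respectively, which is what the Euler–MacLaurin expansion with a three-point estimate of the derivative at the origin gives. Function names and the data series are hypothetical.

```python
import numpy as np

def empirical_covariogram(F, T, j, periodic=False):
    """T * sum_k F[k] * F[k + j]; with periodic=True the series is continued circularly."""
    F = np.asarray(F, dtype=float)
    shifted = np.roll(F, -j)
    if not periodic and j > 0:
        shifted[-j:] = 0.0   # observations beyond the end of the series count as zero
    return T * np.sum(F * shifted)

def transitive_variance(F, T, m, periodic=False):
    """Variance estimate under smoothness class m (0 or 1), ignoring measurement error."""
    divisor = {0: 12.0, 1: 240.0}[m]   # assumed divisors for m = 0 and m = 1
    g0, g1, g2 = (empirical_covariogram(F, T, j, periodic) for j in range(3))
    return T * (3.0 * g0 - 4.0 * g1 + g2) / divisor

F = [1.0, 3.0, 4.0, 3.0, 1.0]   # hypothetical measurements
T = 0.5                         # hypothetical spacing
print(transitive_variance(F, T, m=0), transitive_variance(F, T, m=1))
```

The two estimates differ only by the divisor, so the *m* = 1 value is always a factor 20 below the *m* = 0 value for the same data.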

### 5. The relevant variance approximation depends on the sampling density

The variance approximation (6) is valid for small *T*, or equivalently, for a large mean number of observations *n* = *ℓ/T*. Expressed in terms of *n*, the variance approximation is

In a real situation, the question is of course how large *n* needs to be before the approximation can be used.

The simulations shown in Fig. 2 for very simple measurement functions *f* indicate that already for *n* = 10, the approximation works well. However, some caution is necessary here. Let us consider an example. Let *X* ⊊ *R*^{3} be a modified ball of radius *R*. The modification is as follows: a cone with circular base is added at two opposite points on the surface of the ball, cf. the 2D illustration in Fig. 3. The total height of the cone is *e*, of which only *p*_{1}*R* is outside the ball, cf. Fig. 3. From each cone, a top of height *p*_{2}*e* is removed.

In Fig. 4, the true squared coefficient of error of the Cavalieri estimator (black dots in the figure) is shown as a function of the mean number of sections *n*. In the simulations, *p*_{1} = 0.006 and *p*_{2} = 0.003. The straight lines have slopes (from left to right) of −4, −6 and −2 and their levels have been calculated using an approximating measurement function at the given sampling density, i.e. (from left to right) the measurement function associated with a ball (smoothness class *m* = 1), a ball with two small circular cones (*m* = 2), and a ball with two small circular cones with removed tops (*m* = 0). The formulae for the lines have been derived from (9) and are as follows:

In (10), we have for simplicity used a support of length 2*R* and a volume of 4/3π*R*^{3}, also for the two modified balls, which of course does not affect the results for the choices of *p*_{1} and *p*_{2}, considered in the example.

In this remarkable example, the expected asymptotic behaviour, which is of order *n*^{−2}, is first reached for sample sizes of about 10^{4} or more. The intuitive reason for this is that the jumps of the measurement function are of negligible size and can only be ‘seen’ at the ‘resolution’ provided by a very dense sampling. In fact, the order of *n*^{−2} appears first when there is at least one observation in one of the missing tops of the cones. More interestingly, the variance for sample sizes up to 100, say, may very well be approximated by using the measurement function for a ball. Note that a sample size of 100 corresponds to about one observation in one of the two cones. The intermediate behaviour corresponds to the expected behaviour for a ball modified with two circular cones. So the variance may, as *n* increases, be described by a series of asymptotic behaviours, corresponding to approximating measurement functions which take into account more and more fine details of the actual measurement function.

In Fig. 5, another example of a similar type is presented. It concerns a whole series of measurement functions, indexed by a positive integer *k*. All measurement functions *f*_{k} have the unit interval as support and are obtained from *f*_{1} by a simple geometric procedure. The graph of *f*_{1} is a right triangle,

Now, the graph of *f*_{k} is obtained by cutting the graph of *f*_{1} into 2*k* − 1 pieces of constant breadth and rearranging the pieces such that the piece with the largest function value is placed at ½ and then the other pieces in descending order, symmetrically around ½, cf. Fig. 5. Evidently, *f*_{k} will converge, as *k* → ∞, to *f*_{∞} which has a symmetric triangle as graph,

Formally, we can define *f*_{k} as follows. For *i* = 1, 2, … , 2*k* − 1,

All *f*_{k}s are of smoothness class 0 because they are not continuous while *f*_{∞} is of smoothness class 1. In Fig. 5, the squared coefficient of error of *Q^*_{T} is shown as a function of *n*, for measurement functions *f*_{k}, *k* = 1, 10, 501, 50001, ∞. As can be seen from Fig. 5, the expected asymptotic behaviour of the order of *n*^{−2} is first reached at a sampling density which depends on *k*. The larger *k* is, the higher sampling density or ‘resolution’ is required to discover that *f*_{k} is actually different from *f*_{∞}. For smaller sampling densities, the squared coefficient of error obtained for *f*_{k} can be approximated by that obtained for *f*_{∞}.
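The rearrangement can be sketched in code. The construction below is one reading of the verbal description and rests on several assumptions: *f*_{1}(*x*) = *x* on the unit interval, pieces are translated without being mirrored, and of each pair of pieces the larger one goes to the left of ½. The function names are hypothetical.

```python
import numpy as np

def f_infinity(x):
    """Symmetric triangle on [0, 1] with peak value 1 at x = 1/2."""
    x = np.asarray(x, dtype=float)
    return np.clip(1.0 - np.abs(2.0 * x - 1.0), 0.0, None)

def f_k(x, k):
    """Rearranged right triangle: 2k - 1 pieces of f_1(x) = x (assumed), largest piece in the middle."""
    x = np.asarray(x, dtype=float)
    N = 2 * k - 1
    w = 1.0 / N
    slot = np.clip(np.floor(x * N).astype(int), 0, N - 1)   # 0-based target slot
    d = (slot + 1) - k                                      # signed distance from the middle slot
    # Piece placed in each slot: N in the middle, then descending, alternating left and right.
    piece = np.where(d == 0, N, np.where(d < 0, N - (2 * np.abs(d) - 1), N - 2 * d))
    value = (piece - 1) * w + (x - slot * w)                # f_1 on the original piece, translated
    inside = (x >= 0.0) & (x <= 1.0)
    return np.where(inside, value, 0.0)

k = 10
N = 2 * k - 1
x = (np.arange(N * 50) + 0.5) / (N * 50)     # midpoints that never straddle a piece boundary
print(np.mean(f_k(x, k)))                     # integral of f_k, unchanged by the rearrangement
print(np.max(np.abs(f_k(x, k) - f_infinity(x))))
```

Under these assumptions the integral stays at ½ for every *k*, while the largest deviation from *f*_{∞} is at most twice the piece breadth, so the jumps shrink as *k* grows.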

### 6. Measurement error

In this section, we discuss the quite realistic situation where the measurement function *f* is estimated with error. Let us suppose that we observe *F*_{k} at position *U* + *kT*, where

Furthermore, given *U*, we assume that *F*_{k1} and *F*_{k2} are stochastically uncorrelated for *k*_{1} ≠ *k*_{2} and

The estimator of *Q* becomes

The variance of *Q^^*_{T} can be decomposed into a component due to variation of *U* and a component due to measurement error,

cf. Kiêu (1997). The parameter σ^{2} will be called the cumulative measurement error. Note that the component originating from the measurement error is of the order of *T*, while the other component is of the order of at least *T*^{2}, cf. (6).
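The decomposition can be verified with the law of total variance. The derivation below is a sketch under the assumptions that, given *U*, the observation *F*_{k} has mean *f*(*U* + *kT*) and variance σ²(*U* + *kT*), that the cumulative measurement error is the integral of σ²(*x*), and that the estimator with measurement error (written $\widetilde{Q}_T$ here) is *T* times the sum of the *F*_{k}.

$$
\begin{aligned}
\operatorname{Var}(\widetilde{Q}_T)
&= \operatorname{Var}\!\big(E[\widetilde{Q}_T \mid U]\big) + E\big[\operatorname{Var}(\widetilde{Q}_T \mid U)\big]\\
&= \operatorname{Var}(\hat{Q}_T) + E\Big[T^{2}\sum_{k}\sigma^{2}(U + kT)\Big]\\
&= \operatorname{Var}(\hat{Q}_T) + T\int_{-\infty}^{\infty}\sigma^{2}(x)\,\mathrm{d}x
 = \operatorname{Var}(\hat{Q}_T) + T\sigma^{2}.
\end{aligned}
$$

The last step uses that *T* times the sum of any integrable function over the systematic points has the integral of that function as its mean.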

The covariogram also plays a central role in this more general case where the measurement error is taken into account. Let

be the empirical covariogram. Then, it is easy to show that

Thus, except at the origin, the empirical covariogram is still an unbiased estimator of the theoretical covariogram of the measurement function.

As originally suggested by A. J. Baddeley (personal communication), the empirical covariogram may be bias-corrected at the origin if an estimate *S*^{2} of the error σ^{2} is available. The method of estimating the variance, applicable in the case of no measurement error, may then be used on the bias-corrected covariogram. For instance, for *m* = 0, 1, the resulting variance estimate becomes

The estimates (12) and (13) are obtained by combining (7) and (8) with (11).
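The bias correction can be followed step by step. The display is a sketch which assumes that the empirical covariogram is $\hat{g}(jT) = T\sum_k F_k F_{k+j}$, that the error-free approximations are the three-term expressions with divisors $d_0 = 12$ and $d_1 = 240$, and that the measurement error contributes $T\sigma^{2}$ to the variance.

$$
\begin{aligned}
E[\hat{g}(0)] &= E\Big[T\sum_{k}\big\{f(U+kT)^{2} + \sigma^{2}(U+kT)\big\}\Big] = g(0) + \sigma^{2},\\
E[\hat{g}(jT)] &= g(jT), \qquad j \neq 0,\\
\widehat{\operatorname{Var}}_m &= \frac{T}{d_m}\Big[3\big(\hat{g}(0) - S^{2}\big) - 4\hat{g}(T) + \hat{g}(2T)\Big] + T S^{2}, \qquad m = 0, 1.
\end{aligned}
$$

The first term estimates the component due to systematic sampling from the bias-corrected covariogram, and the second term adds back the component due to measurement error.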

Let us consider two situations where an independent estimate of the measurement error is possible. First, let *Q* be the number of points in a finite set *X*⊊*R*^{3} and *f*(*x*) = *n*(*x*)/*h* where *n*(*x*) is the number of points counted in a disector with position *x* ∈ *R*^{1} and height *h*. Let us suppose that *h* ≪ *T* such that (12) and (13) are expected to provide good approximations to the variance. Furthermore, suppose that not all points are counted in a disector but only a subset corresponding to a sampling fraction of ϕ. If ϕ ≪ 1, the count in the *k*th disector is expected to be distributed as

and *F*_{k} = *C*_{k}/(*h*ϕ). Then,

It follows that σ^{2} can be estimated by

Secondly, let us consider error due to point-counting. In this case, *f*(*x*) is the area of the intersection between a bounded set *X*⊊*R*^{3} and a plane with fixed orientation and position *x* ∈ *R*^{1}. We suppose that the areas are estimated, using a quadratic point grid with shortest distance *u* between neighbour grid points. Then, according to Matheron (1965, p. 88),

where *c* = (2π^{3})^{−1}ζ(3) + (6π)^{−1} = 0.0724, ζ is Riemann's zeta function and *B*(*x*) is the boundary length of the intersection of *X* with a plane with position *x*. Accordingly,

and σ^{2} may be estimated by

The boundary lengths *B*(*U* + *kT*) may in turn be estimated, using intersection counts with planar line grids.

Suppose that the section through *X* with position *U* + *kT* consists of *N*_{k} profiles of identical shape. If *B*_{ki} and *A*_{ki} are the boundary length and area of the *i* th profile in the *k* th section, *i* = 1, … , *N*_{k}, we thus assume that *B*_{ki}/√*A*_{ki} = α. Then,

The resulting estimate of σ^{2} becomes

where the sum is over all profiles in all sections and *F*_{profile} is the point-count estimate of the profile area; for a real example, see Geinisman *et al.* (1996). It should be noted that α refers to the shape of a single profile. The shape factor α may be judged from the nomogram in Gundersen & Jensen (1987, Fig. 18). In the latter figure, the shape factor is called *B/√A*.
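The point-counting case can be sketched numerically. The code assumes that the conditional variance of an area estimate is *c* *u*³ times the boundary length, that the boundary length of each profile is α times the square root of its area, and that the resulting estimate is *S*² = *c* *u*³ α *T* times the sum of the square roots of the profile area estimates. The function names, the profile areas, the grid spacing and the shape factor are hypothetical.

```python
import numpy as np

def matheron_constant(n_terms=100000):
    """c = zeta(3) / (2*pi**3) + 1 / (6*pi), with zeta(3) taken from a truncated series."""
    k = np.arange(1, n_terms + 1, dtype=float)
    zeta3 = np.sum(1.0 / k**3)
    return zeta3 / (2.0 * np.pi**3) + 1.0 / (6.0 * np.pi)

def point_count_noise(profile_areas, alpha, u, T):
    """Assumed form of the estimate: S^2 = c * u**3 * alpha * T * sum(sqrt(F_profile))."""
    profile_areas = np.asarray(profile_areas, dtype=float)
    return matheron_constant() * u**3 * alpha * T * np.sum(np.sqrt(profile_areas))

alpha_disc = 2.0 * np.sqrt(np.pi)      # B / sqrt(A) for a circular profile
areas = [4.0, 9.0, 16.0, 9.0, 4.0]     # hypothetical point-count area estimates, one profile per section
print(matheron_constant())
print(point_count_noise(areas, alpha_disc, u=1.0, T=1.0))
```

The constant evaluates to about 0.0724, in agreement with the value quoted in the text. More irregular profiles have a larger shape factor than the disc and hence a larger estimated noise for the same areas.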

If an estimate of σ^{2} cannot be constructed using special knowledge of the error source, the relevant information may be extracted using an extra term of the empirical covariogram, cf. Kiêu (1997). Using this technique, the resulting variance estimate becomes

In some cases, the set *X* is the disjoint union of a large number of small subsets with an individual size much smaller than *T*, like sparsely distributed cells, pancreatic islands or kidney glomeruli. In this situation, the empirical covariogram may appear to have an extra jump at the origin. This effect is sometimes called the small scale effect. In such cases formulae (14) and (15) may still be used.

### 7. Resampling from a large sample

In order to investigate the applicability to real data sets of the transitive methods, it is of course important to be able to judge the validity of the approximations. This can be done by comparing with a more laborious method based on resampling from a large sample.

In this section, we will briefly explain the resampling method and comment on its statistical properties. Let the large data set be collected with spacing *T*_{0} and consist of the measurements

and *U* is uniform random in [0, *T*_{0}). Let us suppose that *T* = *k*_{0}*T*_{0}, where *k*_{0} is a positive integer. On the basis of the data, we may construct *k*_{0} estimates of *Q* with spacing *T*, viz.

The resampling estimate of Var [*Q*^^_{T}] is simply

Therefore, the resampling estimate is approximately unbiased if Var[*Q*^^_{T}] ≫ Var[*Q*^^_{T0}]. This will be the case if the subsample size is not too large compared to the size of the large sample.
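The resampling procedure can be sketched as follows. The code assumes that the *j*th subsample consists of every *k*_{0}th observation starting at observation *j*, and that the resampling estimate is the mean squared deviation of the *k*_{0} subsample estimates from the large-sample estimate, with divisor *k*_{0}. The function name and the data are hypothetical.

```python
import numpy as np

def resampling_variance(F, T0, k0):
    """Resampling estimate of the variance at spacing T = k0 * T0 from a large sample F."""
    F = np.asarray(F, dtype=float)
    T = k0 * T0
    # One estimate of Q per possible start position within the coarse spacing.
    sub_estimates = np.array([T * F[j::k0].sum() for j in range(k0)])
    full_estimate = T0 * F.sum()          # equals the mean of the subsample estimates
    variance = np.mean((sub_estimates - full_estimate)**2)
    return variance, sub_estimates, full_estimate

F = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]        # hypothetical large sample with spacing T0 = 1
variance, sub_estimates, full_estimate = resampling_variance(F, T0=1.0, k0=2)
print(sub_estimates, full_estimate, variance)
```

With divisor *k*_{0}, the mean of this estimate is the variance at spacing *T* minus the variance at spacing *T*_{0}, which is why the estimate is approximately unbiased only when the large sample is much denser than the subsamples.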

We have studied the bias by simulation too. As measurement function, we have used the function shown in Fig. 6. (It is a smoothed version of the data set *V*_{5}, discussed in the next section.) On top of this measurement function we have independent multiplicative log-normal distributed noise. Given *U*, *F*_{k} is thus log-normal distributed with

More specifically, we have

where ξ_{k} and σ^{2} are chosen such that

In Fig. 7, the results of the simulations are shown for noises of various sizes. The noise is represented by τ. The fine spacing *T*_{0} has been chosen such that on average 50 observations are in the large sample. Note that the bias is most important in the case where the noise is most substantial.

### 8. Analysis of 20 data sets

We have investigated the different methods of variance estimation in 20 data sets from a range of different 3D objects. There are 13 data sets, *V*_{1}, … , *V*_{13}, dealing with volume estimation, based on section data obtained by either magnetic resonance imaging or physical sectioning. In some of these data sets, the areas have been estimated by point-counting as described in Section 6. The remaining seven data sets, *N*_{1}, …, *N*_{7}, deal with 3D-number estimation from disector counts. In Fig. 8, some of the data sets are shown, together with their empirical covariograms. For ease of presentation, the data points as well as the points at which the covariogram is known have been joined by line segments. Note that this does not mean that the functions are actually continuous.

The different methods of estimating the variance are illustrated in Fig. 9, for all 20 data sets. The estimates of the squared coefficient of error are shown, using five different methods of estimation. The data sets contain a variable number of observations (from 30 to 202). In order to make the results comparable between data sets, the variance has been estimated for a spacing corresponding to roughly the same number of non-zero observations, viz. 10 observations. Thus, if the data set has *N* observations different from 0, with distance *T*_{0} between neighbour observations, then the variance is estimated for a sample with spacing *k*_{0}*T*_{0} where *k*_{0} is the integer closest to *N*/10. There are *k*_{0} of these samples, of average size *N*/*k*_{0}.

In each of the plots shown in Fig. 9, *R* indicates the resampling estimate as described in Section 7. The remaining four estimates are based on the transitive methods described in Section 6. Here, (*m*, +) and (*m*, −) refer to estimates of the squared coefficient of error under smoothness class *m*, using either an estimate *S*^{2} of σ^{2}, as described in Section 6, together with the empirical covariogram (+method) or the covariogram alone (−method). Thus, (1, +), (1, −), (0, +) and (0, −) refer to the formulae (13), (15), (12) and (14), respectively. Note that there is only one resampling estimate, whereas there are *k*_{0} estimates for each of the remaining four methods. Note also that for some data sets an estimate *S*^{2} of σ^{2} is not available, so the +methods cannot be used for them.

For the data dealing with volume estimation, the smoothness class is expected to be *m* = 1, cf. Section 4, and this also seems to be the appropriate choice judged from Fig. 9, using the resampling estimates as yardsticks. For number estimation from counts, the suitable choice of smoothness class is less clear, in general. For the data shown here, the jumps of the measurement function are of negligible size compared to the sampling density, and we can see from Fig. 9 that *m* = 1 is an appropriate choice, in terms of both level and variation. Note also that the +method provides a lower variability than the −method and that the −method does not work very well for *m* = 0; see the percentage of negative estimates. As expected, estimates based on knowledge of the mechanism of noise, cf. (12) and (13), are more stable than those using the observations alone, cf. (14) and (15). We also tried to estimate *m* from the data sets themselves, using the methods suggested in Kiêu *et al.* (1999), but we got satisfactory results only for very long observation series.

### 9. Discussion of earlier methods of analysis

In Gundersen & Jensen (1987), it was discussed how to implement the transitive methods suggested by Matheron. It was in that paper tacitly assumed that *m* = 0 and, furthermore, the measurement error was not taken into account. The resulting variance approximation took the form

If instead *m* had been chosen to be 1, then the resulting approximation would have been a factor 20 lower, cf. (8),

In Fig. 10, the resampling estimates are shown, together with the estimates (17) and (18) of the squared coefficient of error, obtained by erroneously neglecting the measurement error. To the right, estimates of the squared coefficient of error for independent sampling are also shown. The figure thereby has the same layout as Gundersen & Jensen (1987, Fig. 7). The conclusion is that the estimate based on the assumption *m* = 0, cf. (17), matches the resampling estimate best, as was also proposed by Gundersen & Jensen (1987).

The reason is probably that two errors cancel. To be more specific, recall that under the model described in Section 6, the variance has two components

As indicated by earlier investigations in the present paper, *m* = 1 is often a suitable choice of smoothness class. By using (17) instead of (18), Var[*Q*^_{T}] is thereby overestimated and at the same time the error term *T*σ^{2} is neglected.

In Gundersen & Jensen (1987, Fig. 8), it was also shown in a collection of examples that the variance was of order *n*^{−2}. In fact, this order of magnitude can be explained as a mixture of magnitudes. Thus, if we assume that *m* = 1, then combining (9) and (19), we get

The variance thereby becomes a mixture of a term of order *n*^{−4} and a term of order *n*^{−1}.
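How such a mixture can look like a single power law of order *n*^{−2} over a limited range may be seen from its local slope on a log-log scale. The sketch below uses hypothetical coefficients `a` and `b` for the two terms; they are chosen only so that the arithmetic is easy to follow and are not taken from the data.

```python
import numpy as np

def mixture_variance(n, a, b):
    """Variance modelled as a * n**-4 (systematic sampling) plus b * n**-1 (measurement error)."""
    n = np.asarray(n, dtype=float)
    return a * n**-4.0 + b * n**-1.0

def local_slope(n, a, b):
    """d log(variance) / d log(n) for the mixture above."""
    n = np.asarray(n, dtype=float)
    sys_term = a * n**-4.0
    noise_term = b * n**-1.0
    return (-4.0 * sys_term - noise_term) / (sys_term + noise_term)

a, b = 1.0, 2e-3   # hypothetical coefficients
for n in (1, 5, 10, 20, 1000):
    print(n, float(local_slope(n, a, b)))
```

The slope moves from −4 for small *n* to −1 for large *n* and equals −2 exactly where the systematic term is half the error term, which for these coefficients happens at *n* = 10.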

### 10. An example and some practical recommendations

The method of estimating the coefficient of error that generally seems to work best for systematic sampling along an axis can be summarized as follows. Let the series of measurements be denoted by *f*_{1}, *f*_{2}, … , *f*_{n} and let

Let us suppose that the scale has been chosen such that the distance between neighbour sampling points is *T* = 1. If an estimate *S*^{2} of the cumulative measurement error is available, then the coefficient of error due to systematic sampling and measurement error can be estimated by

respectively. The estimate of the total coefficient of error becomes

Let us consider the example given in Table 1. Suppose that an estimate *S*^{2} is available, *S*^{2} = 7.8, say. Applying (20), (21) and (22), we obtain

Table 1. An example.

The example illustrates nicely what is expected for stereological data from systematic designs of the type discussed in the present paper, viz. the random noise in the data will be responsible for the dominant part of the coefficient of error. Theoretical support for this statement can be found in Section 6. See also the example illustrated in Fig. 6.
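The recipe described at the start of this section can be sketched in a few lines. The code assumes *T* = 1, lagged product sums *g*_{j} of the measurements, smoothness class *m* = 1 (divisor 240) and a noise estimate *S*² subtracted from *g*_{0}. The measurement series and the value of *S*² below are hypothetical and are not the data of Table 1.

```python
import numpy as np

def coefficient_of_error(f, S2):
    """Squared CE due to systematic sampling, squared CE due to noise, and total CE (T = 1, m = 1)."""
    f = np.asarray(f, dtype=float)
    total = f.sum()
    g0 = np.sum(f * f)
    g1 = np.sum(f[:-1] * f[1:])
    g2 = np.sum(f[:-2] * f[2:])
    var_sys = (3.0 * (g0 - S2) - 4.0 * g1 + g2) / 240.0
    ce_sys2 = var_sys / total**2
    ce_noise2 = S2 / total**2
    # A negative systematic term can occur for noisy data; it is set to zero in the total.
    ce_total = np.sqrt(max(ce_sys2, 0.0) + ce_noise2)
    return ce_sys2, ce_noise2, ce_total

f = [2, 8, 15, 21, 24, 20, 13, 7, 3]   # hypothetical measurements from 9 sections
S2 = 8.0                               # hypothetical estimate of the cumulative measurement error
ce_sys2, ce_noise2, ce_total = coefficient_of_error(f, S2)
print(np.sqrt(max(ce_sys2, 0.0)), np.sqrt(ce_noise2), ce_total)
```

Truncating a negative systematic term at zero is a design choice made for this sketch; reporting the raw value as well makes it easier to spot data sets for which the chosen smoothness class or noise estimate is doubtful.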

For systematic designs as described in this paper, we may therefore make some broad recommendations (with one practical exception, discussed at the end of this section):

(i) Use a distance between sampling points (sections, typically) in the systematic design which yields *n* ∼ 10.

(ii) Use (20) and (22) to check that the contribution to *CE*_{total} due to systematic sampling is negligible.

(iii) Consider carefully the noise mechanism and reduce the noise variance to the desired level.

The recommendation (i) is based on empirical evidence collected from data sets like those analysed in Sections 7 and 8, but evidently need not be valid for artificial or man-made real objects.

It follows from the considerations and examples in Sections 5 and 8 that (20) is expected to be a realistic estimator of *CE*_{sys} in a broad class of cases, as recommended in (ii). Real 3D objects may have artefacts with corners, edges or flat faces. Such features are not expected to affect *CE*_{sys} if the sampling density is not high enough to reveal these artefacts, cf. Figs 4 and 5. If sections are taken perfectly parallel to a flat face, *CE*_{sys} will decrease only as *n*^{−2}, also for small *n*, cf. Fig. 11. If the face is tilted a bit, compared to the section direction, *CE*_{sys} will, however, decrease as *n*^{−4}. For instance, for the rectangle shown in Fig. 11, a tilt of about 2° ensures *n*^{−4} behaviour already for *n* = 10.

An important practical example of this is a large natural object that has to be cut into slabs before embedding. To reduce *CE*_{sys}, it is a good idea to tilt the slab a bit before sectioning so that one typically obtains a few incomplete (and small) sections at the beginning and end of each slab.

Noise is a very different story. If the data are disector counts, which for very small sampling fractions may be regarded as Poisson distributed, then a *CE*_{noise} = 0.01 will require a count of about 10 000 particles. More realistic figures are counts of 100–200, providing a corresponding *CE*_{noise} of 0.1 to 0.07. For point-counting of areas, *CE*_{noise} for a total count of 200 points will rarely exceed 0.02–0.04, cf. Gundersen & Jensen (1987, Fig. 18).
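The figures quoted for disector counts follow from the Poisson assumption, under which the variance of the total count equals its mean, so that the coefficient of error due to noise is one over the square root of the total count. A minimal sketch, with hypothetical function names:

```python
import math

def poisson_ce_noise(total_count):
    """CE due to noise for a Poisson distributed total count: 1 / sqrt(count)."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 1.0 / math.sqrt(total_count)

def count_needed(target_ce):
    """Smallest total count giving a noise CE at or below target_ce (Poisson assumption)."""
    # The small offset guards against floating-point rounding just above an integer.
    return math.ceil(1.0 / target_ce**2 - 1e-9)

for count in (100, 200, 10000):
    print(count, poisson_ce_noise(count))
print(count_needed(0.01))
```

Halving the coefficient of error therefore requires four times as many counted particles, which is why very small values of *CE*_{noise} are rarely practical.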

If the noise mechanism is not well defined, the situation is more demanding. The available estimators of *CE*_{total} are those based on the variance estimators (14) and (15). Our experience based on the data analysis with these estimators is that they are less stable than those based on knowledge of the error mechanism. The best strategy may be to increase the sample size *n* well above 10.