Precision of the fractionator from Cavalieri designs

Authors

  • L. M. Cruz-Orive

    Corresponding author
    1. Departamento de Matematicas, Estadistica y Computacion, Facultad de Ciencias, Universidad de Cantabria, Avda. Los Castros s/n, E – 39005 Santander, Spain
      Professor Luis M. Cruz-Orive. Tel.: +34 942 201424; fax: +34 942 201402; e-mail: lcruz@matesco.unican.es

Summary

A popular procedure to predict the variance of the fractionator consists in splitting the initial collection of fragments into two subsets and using the corresponding particle counts (or any other pertinent measure) in the calculation. The current formula, however, does not account for the local or ‘nugget’ errors inherent in the estimation of fragment contents. Moreover, it does not account for the fact that the contribution of the variability between fragments or slices should rapidly decrease as the sampling fraction increases. For these reasons, an update to the formula is overdue. It should be stressed, however, that the formula applies to Cavalieri slice designs; its application to arbitrary partition designs is therefore not warranted.

1. Introduction

In design stereology the volume of an object, or the total number of cells or particles inside it, is usually estimated by means of a Cavalieri design. To predict the error variance of the corresponding unbiased estimators is therefore an important problem. The methods available for this prediction may be classified as follows (the references correspond to the most recent error variance prediction formulae).

  • (i) Systematic observations used as one subset in their natural, sequential order.
    • Parallel systematic planes. The target is a volume. See Garcia-Fiñana & Cruz-Orive (2000, subsection 5.1). For an application including the point counting effect see Garcia-Fiñana et al. (2003, subsection 3.2).
    • Systematic observations on the circle. Here the target may be curve length, surface area or volume (see Cruz-Orive & Gual-Arnau, 2002, subsection 4.1).
    • Parallel systematic slabs for volume. See Gual-Arnau & Cruz-Orive (1998, subsections 5.4 and 5.5). For an application see McNulty et al. (2000).
    • Parallel systematic slabs for number. See Cruz-Orive (1999, eq. 3.7). The local within-slices error is handled either by assuming that the particle centroids follow approximately a Poisson model (same article, eq. 3.12) or by the double disector technique (same article, eq. 3.19).
  • (ii) Fractionator approach: the initial systematic sample is split into two systematic subsets, and the error variance is predicted from the corresponding two estimators (Cruz-Orive, 1990).

The purpose of this paper is to update the fractionator variance predictor proposed in Cruz-Orive (1990) in the light of more recent results.

In the latter paper the target quantity was the number N of particles in a fixed and bounded subset X ⊂ ℝ³, for instance the number of neurons in a well-defined brain compartment. To estimate N the object X was first split into serial slices of a constant thickness. A systematic subset of slices was drawn with a known sampling fraction, and this subset was split into odd- and even-numbered slices (this design was originally proposed by Gundersen, 1986). After an arbitrary chain of further subsampling stages, the variance predictor was computed from the corresponding two particle number estimates.

The main shortcomings of the current approach may be summarized as follows.

  • 1. The basic probes should be slabs (because the target is particle number), but the model used, leading notably to assumption (7) in Cruz-Orive (1990), was based on plane sections. No account was therefore taken of the fact that the estimator variance should rapidly decrease as the gap between consecutive slices decreases. We introduce the slice thickness effect making use of the result (5.16) from Gual-Arnau & Cruz-Orive (1998).
  • 2. As mentioned above, the number of particles contained in the odd- and even-numbered slice subsets was estimated by further subsampling stages, leading to estimation or ‘local’ errors that were also not taken into account. Here these errors are incorporated using the model described in Cruz-Orive (1999, eq. 3.4).

In addition, we generalize the design in Cruz-Orive (1990) in the following way.

  • 3. We develop a general variance predictor (Eq. 2.27) for the case in which the initial slice sample is split not necessarily into two, but into n systematic subsets (n = 2, 3, … ).

The mentioned general variance predictor is discussed in Section 4 for the particular case of n = 2 in two common situations, namely for ordinary Cavalieri disectors (Subsection 4.1) and for double disectors (Subsection 4.3). The latter have the advantage that the local errors can be estimated directly without resorting to particle arrangement models such as Poisson’s. The formulae are illustrated in each case with numerical examples.

2. Quantity of interest, sampling and estimation

2.1. Target quantity

The quantity to be estimated is the total measure N = µ(X) of a well-defined geometric subset of a fixed three-dimensional object X. In this paper we concentrate on the typical situation in which µ(·) is the counting measure, so that N represents total cell or particle number in X, but the general predictor of the estimator variance (Eq. 2.27) will apply to any other measure µ(·) such as volume, surface area or curve length in space.

Fix a convenient sampling axis Ox. We need the following definitions and notation.

  • Lx: plane normal to Ox at a point of abscissa x.

  • Lx(t): slab of thickness t > 0 at a point of abscissa x, namely the portion of space comprised between the two parallel planes Lx and Lx+t.

  • Lx(t) ∩ X: slice determined in the object X by the slab Lx(t).

  • µt(x) ≡ µ(Lx(t) ∩ X): measure of the slice Lx(t) ∩ X (e.g. number of cells associated with the slice, volume of the slice, etc.). If the slice is empty, then µ(∅) = 0.

We assume that there exists an integrable function f: ℝ → ℝ+ such that

$$\mu_t(x) = \int_x^{x+t} f(y)\,\mathrm{d}y, \quad x \in \mathbb{R}, \tag{2.1}$$

in which case we have

$$N = \int_{-\infty}^{\infty} f(x)\,\mathrm{d}x. \tag{2.2}$$

2.2. Sampling design

Stage 1.  The primary probe is a systematic series of slabs of fixed thickness t with a fixed period T (T ≥ t) (so that the gap between consecutive slabs is T − t ≥ 0), namely

$$\{L_{UT+kT}(t),\ k \in \mathbb{Z}\}, \tag{2.3}$$

where ℤ denotes the set of integers and U ∼ UR(0, 1), namely U is a uniform random variable in the interval (0, 1) which ensures a ‘random start’ for the series. The primary sampling fraction is therefore

$$\tau = t/T. \tag{2.4}$$

Often the effective primary slab is a subslab of thickness h (0 < h ≤ t) entirely observable at a fixed depth within each primary slab, in which case the primary sampling fraction is

$$\tau = h/T \tag{2.5}$$

(see also Subsection 2.5).

Stage 2.  The primary slab series (Eq. 2.3) is split into n systematic subsets. The jth subset consists of the slabs

$$\{L_{UT+(j-1)T+knT}(t),\ k \in \mathbb{Z}\}, \quad j = 1, 2, \dots, n, \quad n = 2, 3, \dots \tag{2.6}$$

The distance between slab midplanes in each of the n subsets, namely their period, is nT, and the corresponding sampling fraction is therefore τ/n.
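The two sampling stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `slab_subsets`, the bounding interval `[lo, hi]` of the object along Ox, and the optional fixed start `u` are hypothetical and do not come from the paper. Slabs are identified by the abscissa of their first face, UT + kT, and slab k is assigned to subset j = (k mod n) + 1, which reproduces the series (2.6).

```python
import math
import random


def slab_subsets(T, t, n, lo, hi, u=None):
    """Abscissas of the primary slabs (Eq. 2.3) hitting [lo, hi],
    split into n systematic subsets as in Eq. (2.6).

    T: period, t: slab thickness (0 < t <= T), n: number of subsets,
    u: random start in (0, 1); drawn uniformly if not given.
    """
    if not (0 < t <= T) or n < 2:
        raise ValueError("need 0 < t <= T and n >= 2")
    if u is None:
        u = random.random()
    start = u * T
    # Slab k occupies [start + k*T, start + k*T + t]; keep those meeting [lo, hi].
    k_min = math.ceil((lo - t - start) / T)
    k_max = math.floor((hi - start) / T)
    subsets = [[] for _ in range(n)]
    for k in range(k_min, k_max + 1):
        # k % n is non-negative in Python, so it equals j - 1 in Eq. (2.6).
        subsets[k % n].append(start + k * T)
    return subsets
```

For instance, `slab_subsets(T=200.0, t=50.0, n=2, lo=0.0, hi=1000.0, u=0.25)` places slabs every 200 units and alternates them between the two subsets, so consecutive slabs within a subset are nT = 400 units apart.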

Remark 2.1.  In Cruz-Orive (1990) the primary sampling fraction τ was denoted by 1/f1 (where f1 was implicitly assumed to be an integer), and n = 2.

2.3. Estimation of N without local errors

The total measure of the ‘Cavalieri sample’ corresponding to the primary ‘Cavalieri slices’ (Eq. 2.3) is

$$Q = \sum_{k \in \mathbb{Z}} \mu_t(UT + kT), \tag{2.7}$$

and the first stage estimator

$$\hat{N} = \tau^{-1} Q \tag{2.8}$$

is unbiased for N (Gual-Arnau & Cruz-Orive, 1998).

For each U, the totals of the n possible second-stage subsamples are

$$Q_j = \sum_{k \in \mathbb{Z}} \mu_t\big(UT + (j-1)T + knT\big), \quad j = 1, 2, \dots, n, \tag{2.9}$$

and clearly,

$$Q = Q_1 + Q_2 + \cdots + Q_n. \tag{2.10}$$

Consider a uniform random permutation of the integers {1, 2, … , n}, and, for simplicity, relabel the corresponding subtotals as {Q1, Q2, … , Qn}. Note that if U is uniform random in (0, 1) and j is an independent uniform random integer in the set {1, 2, … , n}, then the quantity U + j − 1 is uniform random in (0, n). Therefore, for the mentioned random permutation the slice subsets (Eq. 2.6) are Cavalieri subsets of period nT, and {Q1, Q2, … , Qn} are identically distributed (albeit not independent) Cavalieri subtotals satisfying

$$\mathbb{E}(Q_j \mid Q) = Q/n, \quad j = 1, 2, \dots, n, \tag{2.11}$$

where 𝔼(Qj | Q) denotes the conditional expectation of Qj given Q. As a consequence,

$$\mathbb{E}(Q_j) = \tau N / n, \quad j = 1, 2, \dots, n, \tag{2.12}$$

which generalizes Cruz-Orive (1990, eq. 6).

The corresponding second-stage unbiased estimators

$$\hat{N}_j = (n/\tau)\, Q_j, \quad j = 1, 2, \dots, n, \tag{2.13}$$

are therefore identically distributed and unbiased for N.

To model the variance of Q and of each Qj we assume that the function f defined by Eq. (2.1) is (0, p ≥ 1)-piecewise smooth (in the sense of Kiêu et al., 1999), whereby the extension term of the required variances can be predicted using eq. (5.16) from Gual-Arnau & Cruz-Orive (1998) (or eq. (II.3) from Cruz-Orive (1999) with m = 0). This choice is discussed at the end of Section 5. The estimator N̂ comes from a Cavalieri sample of slices of thickness t and period T, whereas each estimator N̂j comes from a Cavalieri subsample of slices of thickness t and period nT. Therefore, the mentioned results imply the following approximations:

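$$\operatorname{Var}(\hat{N}) \approx -\tfrac{1}{6}\, b_1 (1-\tau)^2 T^2, \qquad \operatorname{Var}(\hat{N}_j) \approx -\tfrac{1}{6}\, b_1 (n-\tau)^2 T^2, \tag{2.14}$$

where b1 is the coefficient of the linear term of the model of the covariogram of f, which can be eliminated from the preceding two equations to obtain

$$\operatorname{Var}(\hat{N}_j) = \left(\frac{n-\tau}{1-\tau}\right)^2 \operatorname{Var}(\hat{N}), \quad j = 1, 2, \dots, n. \tag{2.15}$$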

Bearing Eqs (2.8) and (2.13) in mind, and adopting the extension term variances as model variances for the present purposes, we obtain the following generalization of eq. (7) from Cruz-Orive (1990),

$$\operatorname{Var} Q_1 = \operatorname{Var} Q_2 = \cdots = \operatorname{Var} Q_n = \theta^2 \operatorname{Var} Q, \tag{2.16}$$

(over U ∼ UR(0, 1) and over the mentioned random permutations), where

$$\theta = \frac{n - \tau}{n(1 - \tau)}. \tag{2.17}$$

2.4. Estimation model with local errors

Usually each subtotal Qj will not be determined directly, but estimated by Q̃j, say, in further subsampling stages. In general we may write

$$\tilde{Q}_j = Q_j + e_j, \quad j = 1, 2, \dots, n, \tag{2.18}$$

where ej is the random or ‘local’ error associated with the estimation of Qj. We assume that ej is independent of Qj for each j, with the following properties:

  • (a) 𝔼(ej) = 0.
  • (b) Var(ej) = σj².
  • (c) Cov(ei, ej) = 0, i ≠ j.

Now Eqs. (2.11) and (2.12) may be replaced with

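$$\mathbb{E}(\tilde{Q}_j \mid \tilde{Q}) = \tilde{Q}/n, \quad \text{where } \tilde{Q} = \tilde{Q}_1 + \tilde{Q}_2 + \cdots + \tilde{Q}_n, \tag{2.19}$$

and

$$\mathbb{E}(\tilde{Q}_j) = \tau N / n, \quad j = 1, 2, \dots, n, \tag{2.20}$$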

respectively.

On the other hand, recalling the properties of the local errors and Eq. (2.16) we obtain

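$$\operatorname{Var}(\tilde{Q}_j) = \theta^2 \operatorname{Var}(Q) + \sigma_j^2, \qquad \operatorname{Var}(\tilde{Q}) = \operatorname{Var}(Q) + \nu_n, \tag{2.21}$$

where

$$\nu_n = \sum_{j=1}^{n} \sigma_j^2 \tag{2.22}$$

is the sum of the local error variances. Therefore, Eq. (2.16) may be replaced with

$$\operatorname{Var}(\tilde{Q}_j) = \theta^2 \{\operatorname{Var}(\tilde{Q}) - \nu_n\} + \sigma_j^2, \quad j = 1, 2, \dots, n. \tag{2.23}$$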

2.5. Remarks on the estimation of slice contents

Fixed subslab.  When the target parameter is particle number it is usual to observe a subslab of thickness 0 < h ≤ t at a fixed depth in the interior of each primary slab, in which case, τ =h/T (see Eq. 2.5). The contents of the corresponding subslice is then estimated by systematic optical disectors (West et al., 1991, 1996; Cruz-Orive et al., 2003). This approach is intended to reduce the effect of manipulation artefacts arising near the primary slice faces, but it may be especially sensitive to shrinkage and other deformation artefacts (Dorph-Petersen et al., 2001).

Random subslab(s).  In view of the preceding shortcoming it might be tempting to estimate the entire primary slice contents µt(x) by means of a thinner parallel subslab Lz(h) (0 < h ≤ t) hitting the slice Lx(t) ∩ X at a uniform random depth z. In this case we would have τ = t/T, as in Eq. (2.4), and Q would be estimated with the corresponding error (Subsection 2.4). Take U ∼ UR(0, 1). Then, it can be shown that

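$$\hat{\mu}_t(x) = \frac{t+h}{h}\, \mu\big(L_z(h) \cap L_x(t) \cap X\big), \quad z = x - h + (t+h)U, \tag{2.24}$$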

is an unbiased estimator of µt(x). Note that the range of the random subslab hitting the slice is (x − h, x + t), which implies that the preservation of unbiasedness requires the measurement of the contents of grazing subslabs (namely of subslabs containing a slice face). This means that it is in principle not possible to estimate µt(x) without bias from a sampling subslab entirely inside the slice, although the bias should be small if the ratio h/t is small. (The preceding fact is analogous to the fact that in order to estimate the area of a plane figure with a random quadrat it is necessary to consider the positions of the quadrat hitting the boundary of the figure). For the sake of completeness we may consider the case in which the slice contents µt(x) is estimated by means of a parallel Cavalieri series of thinner subslabs of thickness 0 < h ≤ t and period d ≥ h. To avoid empty intersections we may assume that h ≤ d < t + h. In this case,

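$$\hat{\mu}_t(x) = \frac{d}{h} \sum_{k \in \mathbb{Z}} \mu\big(L_{x+(U+k)d}(h) \cap L_x(t) \cap X\big) \tag{2.25}$$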

is an unbiased estimator of µt(x). Again, if particle counts are restricted to subslabs entirely inside the primary slices, then the bias should be small if the ratio h/d is small. For instance in Geiser et al. (1990) the ratio h/d = 1/85 was adopted, and no grazing subslabs were obtained in the whole experiment.
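The role of grazing subslabs in the single random subslab design can be checked numerically. The sketch below assumes that the estimator multiplies the count in a subslab of thickness h by (t + h)/h, with the subslab depth z uniform over (x − h, x + t); the particle depths are hypothetical, and the mean over z is approximated deterministically on a fine grid of depths rather than by random draws.

```python
def subslab_estimate(particles, x, t, h, z):
    """Estimate of the slice content from one subslab [z, z + h),
    assuming the multiplier (t + h) / h.

    particles: particle depths along the sampling axis (counting measure).
    Only particles lying inside the slice [x, x + t) are eligible.
    """
    hits = sum(1 for y in particles if x <= y < x + t and z <= y < z + h)
    return (t + h) / h * hits


def mean_over_depths(particles, x, t, h, m=55000):
    """Average of the estimate over m equally spaced depths z in (x - h, x + t),
    a deterministic stand-in for the expectation over the uniform depth."""
    step = (t + h) / m
    total = 0.0
    for i in range(m):
        z = x - h + (i + 0.5) * step
        total += subslab_estimate(particles, x, t, h, z)
    return total / m


# Hypothetical particle depths inside a slice [0, 50); two lie close to the faces.
particles = [0.5, 3.0, 7.5, 12.0, 20.0, 25.5, 31.0, 38.0, 44.0, 49.5]
mean_estimate = mean_over_depths(particles, x=0.0, t=50.0, h=5.0)
```

With these inputs the averaged estimate agrees with the true count of 10 up to the grid resolution. The particles at depths 0.5 and 49.5 are captured mostly by subslabs that contain a slice face, which is why discarding grazing subslabs would introduce bias.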

2.6. Statement of the problem and main result

The final unbiased estimator of N reads

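$$\tilde{N} = \tau^{-1} \tilde{Q} = \tau^{-1} \sum_{j=1}^{n} \tilde{Q}_j, \tag{2.26}$$

which generalizes the estimator (2) from Cruz-Orive (1990). The problem is to predict Var Ñ in terms of the observed data {Q̃1, Q̃2, … , Q̃n} and of a suitable estimator ν̂n of the sum (2.22) of the corresponding local error variances.

The main result of the present paper, derived in the next section, is the following estimator of the variance predictor,

$$\operatorname{var}(\tilde{N}) = \tau^{-2}\, \frac{n(1-\tau)^2}{(n-1)(n+1-2\tau)} \left[ \sum_{j=1}^{n} \big(\tilde{Q}_j - \tilde{Q}/n\big)^2 - \frac{n-1}{n}\, \hat{\nu}_n \right] + \tau^{-2} \hat{\nu}_n. \tag{2.27}$$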

The following observations can be made.

  • 1. The first term in the right-hand side of the preceding expression estimates the slices contribution, whereas τ⁻² ν̂n estimates the local error contribution to the total variance. If the quantity within square brackets is negative, we may replace it with zero.
  • 2. If the primary sample consists of exhaustive serial slices, namely if τ = 1, then var(Ñ) = ν̂n, that is, the slices contribution is zero and only the local error contribution remains.
  • 3. The corresponding square coefficient of error, namely

$$\operatorname{ce}^2(\tilde{N}) = \frac{\operatorname{var}(\tilde{N})}{\tilde{N}^2} = \frac{n(1-\tau)^2}{(n-1)(n+1-2\tau)} \cdot \frac{\sum_{j=1}^{n} (\tilde{Q}_j - \tilde{Q}/n)^2 - \frac{n-1}{n}\hat{\nu}_n}{\tilde{Q}^2} + \frac{\hat{\nu}_n}{\tilde{Q}^2}, \tag{2.28}$$

    reduces to eq. (16) from Cruz-Orive (1990) for τ = 0, n = 2 and νn = 0.
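A minimal Python sketch of the predictor may help in applying it. It assumes that Eq. (2.27) has the form that follows from the derivation in Section 3, namely a slices term τ⁻² · n(1 − τ)² / ((n − 1)(n + 1 − 2τ)) · [Σ(Q̃j − Q̃/n)² − ((n − 1)/n) ν̂n] plus a local term τ⁻² ν̂n. The function name and argument names are hypothetical.

```python
def fractionator_variance(q_tilde, tau, nu_hat=0.0):
    """Return (N estimate, slices variance term, local variance term).

    q_tilde: estimated subtotals of the n systematic subsets,
    tau: primary sampling fraction (0 < tau <= 1),
    nu_hat: estimate of the sum of the local error variances.
    Assumes the form of Eq. (2.27) stated in the accompanying text.
    """
    n = len(q_tilde)
    if n < 2 or not (0 < tau <= 1):
        raise ValueError("need n >= 2 subsets and 0 < tau <= 1")
    q_total = sum(q_tilde)
    n_est = q_total / tau
    # Sum of squared deviations of the subtotals from their mean.
    ss = sum((q - q_total / n) ** 2 for q in q_tilde)
    factor = n * (1 - tau) ** 2 / ((n - 1) * (n + 1 - 2 * tau))
    # A negative bracket is replaced with zero, as suggested in the text.
    bracket = max(ss - (n - 1) / n * nu_hat, 0.0)
    var_slices = factor * bracket / tau ** 2
    var_local = nu_hat / tau ** 2
    return n_est, var_slices, var_local
```

The total variance estimate is the sum of the last two returned values, and dividing by the squared estimate of N gives the square coefficient of error. With τ = 1 the slices term vanishes, in agreement with the second observation above.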

3. Derivation of the predictor of Var Ñ

Our original derivation followed the pattern used in Cruz-Orive (1990) and took about one and a half pages. The following, much shorter, proof is due to Dr Marta Garcia-Fiñana (personal communication).

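Our purpose is to obtain Var Q̃ using the properties (2.19) and (2.23). First, a direct application of the standard conditional variance decomposition (e.g. Rao, 1973; p. 97) yields

$$\operatorname{Var}(\tilde{Q}_j) = \mathbb{E}\{\operatorname{Var}(\tilde{Q}_j \mid \tilde{Q})\} + \operatorname{Var}\{\mathbb{E}(\tilde{Q}_j \mid \tilde{Q})\}, \quad j = 1, 2, \dots, n. \tag{3.1}$$

Substituting the model results (2.23) and (2.19) and adding up from j = 1 to n we obtain

$$\operatorname{Var}(\tilde{Q}) = \frac{n}{n^2\theta^2 - 1} \left\{ \sum_{j=1}^{n} \mathbb{E}\,\operatorname{Var}(\tilde{Q}_j \mid \tilde{Q}) + (n\theta^2 - 1)\,\nu_n \right\}. \tag{3.2}$$

It only remains to find an unbiased estimator of the first term in the right-hand side of the preceding identity from the data. It is easy to verify (similarly as in Cruz-Orive, 1990, eq. 14) that

$$\mathbb{E}\left\{ \sum_{j=1}^{n} \big(\tilde{Q}_j - \tilde{Q}/n\big)^2 \right\} = \sum_{j=1}^{n} \mathbb{E}\,\operatorname{Var}(\tilde{Q}_j \mid \tilde{Q}), \tag{3.3}$$

which shows that the required estimator is Σj (Q̃j − Q̃/n)². Substituting this estimator into the right-hand side of Eq. (3.2) and bearing in mind that Var Ñ = τ⁻² Var Q̃, we obtain

$$\operatorname{var}(\tilde{N}) = \tau^{-2}\, \frac{n}{n^2\theta^2 - 1} \left\{ \sum_{j=1}^{n} \big(\tilde{Q}_j - \tilde{Q}/n\big)^2 + (n\theta^2 - 1)\,\hat{\nu}_n \right\}. \tag{3.4}$$

Finally, substituting θ by its value in Eq. (2.17), and then subtracting and adding τ⁻² ν̂n in the right-hand side of the preceding expression, we obtain the estimator (2.27).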
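The algebra behind this last step can be written out as follows. This is a worked sketch that assumes θ = (n − τ)/(n(1 − τ)) and that the local error term enters the penultimate expression with the coefficient (nθ² − 1); both are reconstructions from the model equations of Section 2 rather than quotations.

```latex
% Step 1: simplify the common denominator using the assumed value of theta.
n^2\theta^2 - 1
  = \frac{(n-\tau)^2 - (1-\tau)^2}{(1-\tau)^2}
  = \frac{(n-1)(n+1-2\tau)}{(1-\tau)^2},
\qquad\text{so}\qquad
\frac{n}{n^2\theta^2 - 1} = \frac{n(1-\tau)^2}{(n-1)(n+1-2\tau)}.

% Step 2: subtract and add \tau^{-2}\hat{\nu}_n; the coefficient of
% \hat{\nu}_n left inside the slices term is
\frac{n(n\theta^2 - 1)}{n^2\theta^2 - 1} - 1
  = -\frac{n-1}{n^2\theta^2 - 1}
  = -\frac{n}{n^2\theta^2 - 1}\cdot\frac{n-1}{n}.
```

The second identity is what produces the term −((n − 1)/n) ν̂n inside the square brackets of the slices contribution, leaving τ⁻² ν̂n as a separate local error contribution.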

4. Important special cases with n = 2

Commonly two Cavalieri subsets are obtained from the primary slice subsample. Thus, we concentrate on Eq. (2.28) with n = 2, namely

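$$\operatorname{ce}^2(\tilde{N}) = \frac{(1-\tau)^2}{3 - 2\tau} \cdot \frac{(\tilde{Q}_1 - \tilde{Q}_2)^2 - \hat{\nu}_2}{(\tilde{Q}_1 + \tilde{Q}_2)^2} + \frac{\hat{\nu}_2}{(\tilde{Q}_1 + \tilde{Q}_2)^2}, \tag{4.1}$$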

where the first term in the right-hand side estimates the slices contribution, whereas the second term estimates the local error contribution. As indicated in Subsection 2.6, if the slices contribution is negative we may replace it with zero.

4.1. One-way disectors and Poisson model

We adopt the following conditions and notations.

  • (a) The target quantity N is the total number of cells or particles in a bounded non-random solid X.
  • (b) The primary Cavalieri sample consists of slices of thickness t and period T ≥ t. Within each slice a primary disector of thickness h ≤ t is observed at a fixed depth into the slice. The fraction τ = h/T (0 < τ ≤ 1) must be known.
  • (c) The primary Cavalieri sample is split into n = 2 Cavalieri subsets. The corresponding true particle subtotals are denoted by Qo and Qe, respectively. The subscripts ‘o’ and ‘e’ refer to odd- and even-numbered slices, respectively. This is only a convention, because in theory the Cavalieri subtotals are labelled according to a random permutation (Subsection 2.3).
  • (d) Neither Qo nor Qe is determined directly, but estimated by subsampling smaller systematic (usually optical) subdisectors equivalent to boxes of thickness h ≤ t and reference area period w (w ≥ 1) within each slice (West et al., 1991, 1996; Cruz-Orive et al., 2003). Then

$$\tilde{Q}_o = w\, Q_o^-, \qquad \tilde{Q}_e = w\, Q_e^- \tag{4.2}$$

    are unbiased estimators of Qo and Qe, respectively, where Qo⁻ and Qe⁻ represent the corresponding total numbers of particles counted in the optical subdisectors. The local error variances of Q̃o and Q̃e are denoted by σo² and σe², respectively, and we set

$$\nu_2 = \sigma_o^2 + \sigma_e^2. \tag{4.3}$$

Because at present we lack a suitable predictor of ν2 for systematic boxes we may adopt the Poisson particle model, namely a model in which the particle centroids form, at least approximately, a homogeneous and isotropic Poisson point process. Under this model Qo⁻ is a Poisson random variable whose mean and variance are estimated unbiasedly by Qo⁻ itself, and similarly for Qe⁻. Therefore, an unbiased estimator of ν2 is

$$\hat{\nu}_2 = w^2 (Q_o^- + Q_e^-). \tag{4.4}$$

With the preceding conditions and assumptions, the unbiased estimator of N is

$$\tilde{N} = \tau^{-1}\, w\, (Q_o^- + Q_e^-), \tag{4.5}$$

and Eq. (4.1) becomes

$$\operatorname{ce}^2(\tilde{N}) = \frac{(1-\tau)^2}{3 - 2\tau} \cdot \frac{(Q_o^- - Q_e^-)^2 - (Q_o^- + Q_e^-)}{(Q_o^- + Q_e^-)^2} + \frac{1}{Q_o^- + Q_e^-}. \tag{4.6}$$

4.2. Poisson model: example

Problem and data.  In a neuroscience experiment it was desired to estimate the total number N of neurons of a given type in a well-defined compartment X of a mouse brain. The compartment was embedded in plastic and exhaustively cut into slices of t = 50 µm thickness. Every fourth slice was chosen with UR start, to yield a primary Cavalieri sample of period T = 50 × 4 = 200 µm. The primary sample was split into n = 2 subsamples consisting of the odd- and the even-numbered slices, respectively. Each slice from each subsample was in turn subsampled by UR systematic optical disectors of thickness h = 30 µm at a fixed depth inside the slice, the reference area subsampling period being w = 100. The odd- and even-numbered subsamples yielded the aggregate neuron counts Qo⁻ = 94 and Qe⁻ = 77, respectively.

Calculations.  From Eq. (2.5),

$$\tau = h/T = 30/200 = 3/20. \tag{4.7}$$

Applying Eq. (4.5) we obtain

$$\tilde{N} = (3/20)^{-1} \cdot 100 \cdot (94 + 77) = 114\,000 \text{ neurons}. \tag{4.8}$$

Lacking a specific prediction formula for the error variance due to the optical disector subsampling we tentatively use the aforementioned Poisson assumption. Applying Eq. (4.6) we obtain

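$$\operatorname{ce}^2(\tilde{N}) = \frac{(1 - 3/20)^2}{3 - 2 \cdot 3/20} \cdot \frac{(94 - 77)^2 - (94 + 77)}{(94 + 77)^2} + \frac{1}{94 + 77} = 0.00108 + 0.00585 = 0.00693. \tag{4.9}$$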

Thus, slice sampling contributes 16% of the total error variance, whereas disector subsampling contributes the remaining 84%. The corresponding ce values are 3.3% and 7.6%, respectively, and the total ce(Ñ) is 8.3%. (These data can be interpreted only in the context of the corresponding experiment involving several groups with several animals per group.) For the sake of comparison, the current formula (17) from Cruz-Orive (1990) yields just a total ce(Ñ) = 5.7%.
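The figures quoted for this example can be reproduced with a short Python sketch. It assumes that, under the Poisson model and with n = 2, the slices term of the squared coefficient of error is (1 − τ)²/(3 − 2τ) · [(Qo⁻ − Qe⁻)² − (Qo⁻ + Qe⁻)]/(Qo⁻ + Qe⁻)² and the local term is 1/(Qo⁻ + Qe⁻), and that the earlier 1990 formula amounts to |Qo⁻ − Qe⁻| / (√3 (Qo⁻ + Qe⁻)). The function name is hypothetical; exact fractions are used so that the estimate of N comes out as an integer.

```python
from fractions import Fraction
from math import sqrt


def poisson_fractionator_ce2(q_odd, q_even, tau):
    """Squared ce contributions (slices, local) for n = 2 under the
    Poisson assumption, in the form stated in the accompanying text.

    q_odd, q_even: raw disector counts; tau: sampling fraction h/T.
    """
    total = q_odd + q_even
    ce2_local = Fraction(1, total)
    bracket = Fraction((q_odd - q_even) ** 2 - total, total ** 2)
    # A negative slices contribution is replaced with zero.
    ce2_slices = (1 - tau) ** 2 / (3 - 2 * tau) * max(bracket, 0)
    return ce2_slices, ce2_local


# Data of the example: h = 30, T = 200, w = 100, counts 94 and 77.
tau = Fraction(30, 200)
w, q_odd, q_even = 100, 94, 77
n_tilde = w * (q_odd + q_even) / tau

ce2_slices, ce2_local = poisson_fractionator_ce2(q_odd, q_even, tau)
ce_slices = sqrt(ce2_slices)
ce_local = sqrt(ce2_local)
ce_total = sqrt(ce2_slices + ce2_local)
share_slices = float(ce2_slices / (ce2_slices + ce2_local))
ce_1990 = abs(q_odd - q_even) / (sqrt(3) * (q_odd + q_even))
```

Rounded to one decimal place, the three ce values are 3.3%, 7.6% and 8.3%, the slices share of the total variance is about 16%, and the earlier formula gives 5.7%, in line with the text.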

4.3. Double disectors

The conditions (a)–(c) of the sampling design described in Subsection 4.1 apply here without any change. The subdisectors, however, are physical (Sterio, 1984) rather than optical (see for instance Geiser et al., 1990, 1994), and can thereby be used as double subdisectors yielding the following counts:

  • Q1o⁻ = total number of particles counted in the subdisectors from the odd-numbered slices,

  • Q2o⁻ = analogous count with the roles of the reference and look-up slices reversed,

  • Δo = |Q1o⁻ − Q2o⁻|, the absolute difference,

  • Qo⁻ = Q1o⁻ + Q2o⁻, the total number of particles counted in the odd-numbered slices,

and similarly for the even-numbered slices. As in Subsection 4.1 let w (w ≥ 1) represent the reference area period of the subdisectors. The unbiased estimators of Qo, Qe and N are

$$\tilde{Q}_o = \frac{w}{2}\, Q_o^-, \qquad \tilde{Q}_e = \frac{w}{2}\, Q_e^-, \qquad \tilde{N} = \tau^{-1}\, \frac{w}{2}\, (Q_o^- + Q_e^-), \tag{4.10}$$

respectively. Note that the estimators (4.2) no longer hold because now Qo⁻, Qe⁻ represent ‘double’ counts. By contrast, the local error variances σo² and σe² may be estimated without resorting to a model because we have a pair of observations for each of them. Nonetheless, it is necessary to assume that Q1o⁻ and Q2o⁻ are independent (which is reasonable only if the subdisectors are made small enough, so that they contain only a few particles), and similarly for Q1e⁻ and Q2e⁻. In such case,

$$\hat{\nu}_2 = \frac{w^2}{4}\, (\Delta_o^2 + \Delta_e^2) \tag{4.11}$$

is an unbiased estimator of ν2 (see also Cruz-Orive, 1999, eq. 3.18). With these premises Eq. (4.1) becomes

$$\operatorname{ce}^2(\tilde{N}) = \frac{(1-\tau)^2}{3 - 2\tau} \cdot \frac{(Q_o^- - Q_e^-)^2 - (\Delta_o^2 + \Delta_e^2)}{(Q_o^- + Q_e^-)^2} + \frac{\Delta_o^2 + \Delta_e^2}{(Q_o^- + Q_e^-)^2}. \tag{4.12}$$
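The double disector calculation can be sketched in Python as follows. The sketch rests on assumptions that should be checked against the original formulae: the subtotals are estimated by w/2 times the double counts, and the sum of the local error variances is estimated by (w²/4)(Δo² + Δe²), where Δ denotes the absolute difference between the two counts of a subset. The function name, the argument names and the counts used in the usage line are hypothetical and are not the data of the paper.

```python
from fractions import Fraction


def double_disector_ce2(q1_odd, q2_odd, q1_even, q2_even, tau, w):
    """Return (N estimate, squared ce of slices, squared ce of local errors)
    for n = 2 with double disectors, under the assumptions stated in the text.

    q1_*: counts with the first slice as reference,
    q2_*: counts with the slice roles reversed,
    tau: sampling fraction h/T, w: reference area period.
    """
    q_odd = q1_odd + q2_odd      # double count, odd-numbered disectors
    q_even = q1_even + q2_even   # double count, even-numbered disectors
    total = q_odd + q_even
    n_tilde = Fraction(w * total, 2) / tau
    # Sum of squared differences between the paired counts.
    delta2 = (q1_odd - q2_odd) ** 2 + (q1_even - q2_even) ** 2
    ce2_local = Fraction(delta2, total ** 2)
    bracket = Fraction((q_odd - q_even) ** 2 - delta2, total ** 2)
    # A negative slices contribution is replaced with zero.
    ce2_slices = (1 - tau) ** 2 / (3 - 2 * tau) * max(bracket, 0)
    return n_tilde, ce2_slices, ce2_local


# Hypothetical counts, for illustration only.
n_tilde, ce2_slices, ce2_local = double_disector_ce2(
    50, 54, 60, 57, tau=Fraction(1, 20), w=9
)
```

The factor w²/4 cancels in the coefficient of error, so only the raw counts, their paired differences and τ enter the two contributions.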

4.4. Double disectors: example

Problem and data.  In a toxicology experiment it was desired to estimate the total number N of inhaled polystyrene microspheres (of a mean diameter of 6 µm) in a mouse lung X. Preliminary trials showed that the microspheres could be identified unambiguously via light microscopy in 10-µm-thick slices. The complete lung was therefore embedded in paraffin and exhaustively cut into 10-µm-thick slices. Every 20th slice, and the next, were chosen (with UR start) to obtain a primary Cavalieri sample of h = 10-µm-thick disectors (recall that two consecutive slices of thickness h make a physical disector of thickness h because the idea is to count only microspheres hit by one slab but not by the other), with period T = 20 × 10 = 200 µm. The primary sample was split into n = 2 subsamples consisting of the odd- and the even-numbered disectors, respectively. Each primary disector had to be subsampled by systematic subdisectors (because the final linear magnification required was about 500×) with a reference area period w = 9. Each subdisector was first analysed with the first slice as the sampling slice and the second slice as the look-up slice. The aggregate microsphere counts thus obtained in the odd- and even-numbered disectors wereinline imagerespectively. Then, each disector was analysed with the slice roles reversed, yielding inline image microspheres, respectively.

Calculations.  From Eq. (2.5),

$$\tau = h/T = 10/200 = 1/20. \tag{4.13}$$

Next,

image((4.14))

whereby the third Eq. (4.10) yields

image((4.15))

whereas Eq. (4.12) yields

image( (4.16))

Thus, slice sampling contributes 35% to the total error variance, whereas disector subsampling contributes the remaining 65%. The corresponding ce values are 4.1% and 5.6%, respectively, and the total ce(Ñ) is 7.0%. (As in the example of Subsection 4.2, the preceding results have to be interpreted in the context of the corresponding experiment.) The current formula (17) from Cruz-Orive (1990) yields ce(Ñ) = 5.4%.

5. Discussion

The ‘splitting’ variance estimator (2.27) developed here is intended to replace the hitherto available version from Cruz-Orive (1990). The new version incorporates the known sampling fraction τ and the local error effects, and it allows the sample to be split into an arbitrary number of subsets.

It may be tempting to apply the estimator to splitting designs other than Cavalieri’s, as has been done in the past with the older estimator (see for instance Gundersen, 2002, section 6). In such cases the results may be at best tentative, because Eq. (2.27) has been derived under relatively strict conditions (Subsection 2.2). For proper Cavalieri designs, however, Eq. (2.27) should perform similarly to those in Cruz-Orive (1999, section 3), because all such formulae are obtained under the same assumptions. For the time being it cannot be said which formula will be the better in each particular case; Eq. (2.27) is at least simpler and it requires less book-keeping, because only subtotals are used (as opposed to individual slice counts in their original order). The optimal number of subsets is also an open question. The choice n = 2 is at least convenient.

An obvious improvement would concern the estimation of the aggregate local variance νn, at least for the systematic subsampling of ‘boxes’ (Subsection 4.1). A suitable predictor would make the current Poisson assumption unnecessary. It should be emphasized that the latter is more a statistical resource than an attempt to model the pattern of cell centroids in space. The double disector formula (Eq. 4.12) also uses the fairly strong assumption that the two particle counts from each disector are independent.

The adoption of a measurement function f that allows us to write Eq. (2.1) may be reasonable if the number of particles is very large, so that the number of particles in an infinitesimal slice Lx(dx) ∩ X can be represented by f(x)dx. Only in this way is it possible to use the theory in Gual-Arnau & Cruz-Orive (1998), which applies to the case in which f is a section area function and the target is a volume. It might be tempting to minimize the technical problems by modelling the smoothed slice function µt(x) itself instead of f. This would, however, be equivalent to adopting the alternative (A1) instead of (A2) from Gual-Arnau & Cruz-Orive (1998, p. 893). As a result, we would obtain the classical variance predictors not involving the sampling fraction τ, which would constitute a shortcoming.

The smoothness constant adopted for f was m = 0 (Subsection 2.3), which constitutes a fairly conservative choice. The choice of m will affect only the first multiplying fraction in the right-hand side of Eq. (2.27). The alternative m = 1 (which corresponds to a continuous f whose first derivative exhibits finite jumps) would lead to a more complicated multiplying fraction; its implementation would probably not be worthwhile, however, given that the usual target is particle number, and that the value of the variance estimator will often serve only as a rough guide (especially for non-Cavalieri designs).

Acknowledgements

The idea of considering an arbitrary number n ≥ 2 of Cavalieri subsets was suggested to me by Professor Neil Roberts in 1996. I also wish to thank Dr Marta Garcia-Fiñana for the improved derivation in Section 3, and the two referees for constructive remarks. This research was supported by the Spanish Ministry of Science and Technology I + D Project no. BSA2001-0803-C02-01.
