Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections


Dr Carolina Wählby


We present a region-based segmentation method in which seeds representing both object and background pixels are created by combining morphological filtering of both the original image and the gradient magnitude of the image. The seeds are then used as starting points for watershed segmentation of the gradient magnitude image. The fully automatic seeding is done in a generous fashion, so that at least one seed will be set in each foreground object. If more than one seed is placed in a single object, the watershed segmentation will lead to an initial over-segmentation, i.e. a boundary is created where there is no strong edge. Thus, the result of the initial segmentation is further refined by merging based on the gradient magnitude along the boundary separating neighbouring objects. This step also makes it easy to remove objects with poor contrast. As a final step, clusters of nuclei are separated, based on the shape of the cluster. The number of input parameters to the full segmentation procedure is only five. These parameters can be set manually using a test image and thereafter be used on a large number of images created under similar imaging conditions. This automated system was verified by comparison with manual counts from the same image fields. About 90% correct segmentation was achieved for two- as well as three-dimensional images.

1. Introduction

Segmentation is the process by which an image is divided into its constituent objects, or parts, and background. It is often the first, most vital, and most difficult step in an image analysis task. The result of the segmentation usually determines eventual success of the final analysis. For this reason, many segmentation techniques have been developed by researchers worldwide, and there exist almost as many segmentation methods as there are segmentation problems. Automatic segmentation of cell nuclei from two- and three-dimensional (2D and 3D) images of cells in tissue allows the study of individual cell nuclei within their natural tissue context. Compared with manual methods based on drawing the outlines of the nuclei with a mouse, automatic methods need far less interaction, and the result is more objective and easily reproduced. Automation also increases the amount of data that can be processed. Once the objects of interest have been delineated, a large number of descriptive features can be extracted from the objects (see Rodenacker & Bengtsson, 2003, for an overview).

The difficulties in automatic segmentation of images of cells in tissue produced by fluorescence microscopy usually have three causes. First, the image background intensity is often uneven due to autofluorescence from the tissue and fluorescence from out-of-focus objects. This unevenness makes the separation of foreground and background a non-trivial task. Second, intensity variations within the nuclei further complicate the segmentation as the nuclei may be split into more than one object, leading to over-segmentation. Third, cell nuclei are often clustered, making it difficult to separate the individual nuclei.

A very simple and often used method for image segmentation is thresholding, based on histogram characteristics of the pixel/voxel intensities of the image. For an overview of thresholding techniques, see Sahoo et al. (1988). In order to obtain a satisfactory segmentation result by thresholding, a uniform background is required. Many background correction techniques exist (e.g. Lindblad & Bengtsson, 2001; Krtolica et al., 2002), but they may not always result in an image suitable for further analysis by thresholding. The transition between object and background may be diffuse, making an optimal threshold level difficult to find. At the same time, a small change in the threshold level may have a great impact on further analysis. Feature measures such as area, volume and mean pixel intensity are directly dependent on the threshold. Adaptive thresholding, i.e. local automatic thresholding, can be used to circumvent the problem of varying background or as a refinement to a coarse global threshold (Ortiz de Solorzano et al., 1999). The problems of segmenting clustered objects and choosing a suitable threshold level for objects with unsharp edges will, however, remain. Thresholding does not have to be the final step in the segmentation procedure. An intensity threshold can be used as a start for further processing, e.g. by the morphological operations presented below and/or visual inspection.

Instead of defining the border between object and background by a threshold in the image intensity, similarity between neighbouring pixels/voxels in the original image, or in the corresponding gradient image, can be used. This is usually referred to as ‘region growing’, as starting regions grow by connecting neighbouring pixels/voxels of similar grey-level. Many region growing algorithms result in over-segmented images, i.e. too many object regions are formed. In Pavlidis & Liow (1990), region growing is combined with region merging based on edge information, and in Lin et al. (2000), the images are preprocessed with an adaptive anisotropic filter to reduce over-segmentation. The adaptive anisotropic filter reduces noise in homogeneous regions while sharpening discontinuities. Using these methods one is still left to face the clustering problem, i.e. finding separation lines when no intensity variation is present, which is very prominent in cell nuclei segmentation. Another approach, described in Adams & Bischof (1994), is to let the regions grow from predefined small regions, known as seeds. Each region in the resulting segmented image will contain exactly one of the starting seeds. Both manually marked seeds and an automatic seeding method are described. The problem with this approach to cell nuclei segmentation is that it is very difficult to construct a seeding method that puts exactly one seed in each nucleus when the nuclei are clustered and/or have internal intensity variations.

A popular region growing method, which has proved to be very useful in many areas of image segmentation and analysis, is the so-called watershed algorithm. The method was originally suggested by Digabel and Lantuéjoul, and extended to a more general framework by Beucher & Lantuéjoul (1979). Watershed segmentation has since been refined and used in many situations (for an overview see Meyer & Beucher, 1990; Vincent, 1993). The main difference between the watershed method and ordinary region growing is that the watershed method works per intensity layer instead of per neighbour layer. If the intensity of the image is interpreted as elevation in a landscape, the watershed algorithm will split the image into regions similar to the drainage regions of this landscape. The watershed borders will be built at the crests in the image. In a gradient magnitude image, water will start to rise from minima representing areas of low gradient, i.e. the interior of the objects and the background, and the watershed borders will be built at the maxima of the gradient magnitude. However, if watershed segmentation is applied directly to the gradient magnitude image, it will almost always result in over-segmentation, owing to the intensity variations within both objects and background. Instead of letting water rise from every minimum in the image, water can be allowed to rise only from places marked as seeds. Seeded watersheds have previously been described (e.g. Meyer & Beucher, 1990; Beucher, 1992; Vincent, 1993; Lockett et al., 1998; Landini & Othman, 2003). Fully automatic foreground seeding is tricky, and using morphological filtering, one often ends up with more than one seed per object, or objects containing no seed at all. More than one seed per foreground object will in many methods (e.g. Meyer & Beucher, 1990) result in background seeds passing through foreground components, leading to incorrect segmentation results.
Many seeded watershed segmentation methods are therefore based on manual seeding (e.g. Lockett et al., 1998), requiring extensive user interaction. In Stoev & Straßer (2000), a way of using a seeded watershed for extracting a single, manually seeded region of interest is presented. It uses four merging criteria to overcome the over-segmentation problem. The threshold values needed for the merging step are all calculated from the marked seed in the region of interest. Merging to reduce over-segmentation is also described in Najman & Schmitt (1996) and Lemaréchal & Fjortoft (1998).

Edge-based segmentation techniques, which try to connect local maxima of the gradient image, often run into problems when trying to produce closed curves. That is why region-based methods, such as region growing or watershed, that group similar pixels are often used. Another group of methods that do not have the problem of being required to produce closed curves are methods related to snakes or active shape models. From a rough marking of the border or a seed inside the object of interest a curve expands until it finds a strong edge. The function describing the expansion consists of different energy terms attracting the curve to edges. The approach with expanding curves has been used for cell nuclei segmentation (Garrido & Pérez de la Blanca, 2000; Ortiz de Solorzano et al., 2001). The problems with this method are defining suitable energy terms and, again, the problem of constructing automatic seeding methods, which are restricted to one unique seed per nucleus.

Cell nuclei are usually convex and do not show narrow waists. The shape of the cell nuclei itself can therefore be used for a priori modelling, or as an object-specific feature, in the search for a suitable segmentation method discriminating between single nuclei and clusters of nuclei. In Yang & Parvin (2002), a 3D blob segmentation method based on elliptic feature calculation, convex hull computations and size discrimination is described. A careful choice of a scale parameter is needed, and the edges of the resulting objects will not necessarily be aligned with the edges of the nuclei. In two dimensions (see Yang & Parvin, 2003), internal cell compartments and noise were localized with this method. The compartments were then interpolated using harmonic cuts to enable cell segmentation and cell cluster division based on segmentation of the vector field from the regularized centroid transform. A restricted convex hull computation is used for slice-wise 3D segmentation in Beliën et al. (2002). The restricted convex hull draws a straight line between all contour pixel pairs, which are separated by a distance less than or equal to a constant r. Background pixels found on these lines are added to the object and define the restricted convex hull. The deepest points in the restricted convex hull decide the location of end-points of separation lines. The information obtained per slice is later joined to construct 3D objects. Watershed segmentation applied to distance-transformed binary images (usually binarized through thresholding) is useful for separating touching objects that are approximately convex (see Malpica et al., 1997; Ortiz de Solorzano et al., 1999). In Krtolica et al. (2002), similar separating lines between touching objects are found in a slightly different way. The distance image is thresholded, creating a new binary image consisting of shrunk versions of all the objects.
Dividing lines are thereafter defined as the skeleton of the background of this binary image.

None of the above described methods will alone produce a satisfactory result on 2D and 3D images of fluorescence-stained nuclei in tissue if (1) the nuclei are clustered, (2) the image background is variable and (3) there are intensity variations within the nuclei. Here we present a method in which we combine the intensity information, the gradient information and the shape of the nuclei for improved segmentation. The method requires few input parameters and gives stable results. Morphological filtering on the intensity image is used for finding object seeds. Morphological filtering of the gradient magnitude image is used for finding background seeds. Seeded watershed segmentation is then applied to the gradient magnitude image, and region borders are created at the crest lines in the gradient image. More than one seed in an object means that the object will be divided into more than one region, i.e. we will start with over-segmentation. After watershed segmentation we merge neighbouring objects and only keep those borders that correspond to strong edges. This step will also remove objects with poor contrast. If the nuclei are tightly clustered, no edge is present where they touch, and they will therefore not be separated. Objects found by the first steps of the segmentation process are further separated on the basis of shape. Shape-based cluster separation using the distance transform is applied to all objects found by the previous steps, but only those separation lines that go through deep enough valleys in the distance map are kept. In the method presented, a true object must contain at least one seed, its borders to neighbouring objects must have a sufficiently strong gradient magnitude average and it must have a reasonable shape. The method is independent of image dimensionality, and has been tested on six 2D epifluorescent images and one 3D confocal image of cell nuclei. It will very likely prove useful also for many other types of images.

2. Description of method

We illustrate the segmentation method and the result after each processing step on a 2D image of fluorescence-labelled cell nuclei (see Fig. 2). Exactly the same methodology is used for the 2D and 3D images, except that each picture element (a pixel in 2D, a voxel in 3D) has a larger number of neighbours to take into consideration in the higher dimension.

Figure 2.

(A) Part of an original 2D fluorescence microscopy image of a slice of a tumour. (B) The gradient magnitude of A. (C) The foreground seeds found by extended h-maxima transformation. (D) The background seeds found by extended h-maxima transformation of the gradient magnitude image followed by removal of small components. (E) Result of seeded watershed segmentation. (F) The result after merging seeded objects based on edge strength. Poorly focused objects are also removed in this step. (G) The distance transform of the objects in the segmented image. The brighter the intensity the further the pixel is from the background or a neighbouring object. (H) Watershed segmentation of the distance transform before merging. (I) Final segmentation result based on intensity, edge and shape information.

2.1. Preprocessing

The 2D images, each of size 1024 × 1024 pixels, were smoothed by a 3 × 3 Gaussian filter. No other preprocessing was necessary.

The voxels of the 3D image were non-cubic, and the fluorescent signal was degraded in the optical sections furthest into the tissue owing to light attenuation. These two direction dependencies in the image data had to be corrected before the segmentation method could be applied. The degradation of light was compensated by first determining the function describing the attenuation of the signal. This was done by setting a foreground/background threshold for the image. The same threshold was used in all optical sections and the mean intensity of the signal above the threshold was calculated. The mean signal per slice was plotted against slice number, and the attenuation was shown to be linear within the information-carrying slices of the image volume. A straight line was fitted to the linear part of the data in a least-squares sense, and all the image slices were compensated by multiplication using I_z,comp = I_z × m/(k × z + m), where I_z,comp is the compensated version of slice z, I_z is the original slice z, m is the mean intensity value where the fitted line cuts the y-axis, and k is the slope of the line describing the attenuation. The compensation was applied to all voxels in the image, not only those above the threshold; however, because it is a multiplicative compensation, values close to zero before compensation will be close to zero also after compensation. This makes the method less sensitive to the initial threshold than an additive method, such as that described in Umesh Adiga & Chaudhuri (2001).
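The attenuation correction can be sketched in a few lines of NumPy. For illustration, the plain per-slice mean replaces the thresholded mean, and the whole attenuation curve (not only its linear part) is fitted; both are simplifications of the procedure described above.

```python
import numpy as np

def compensate_attenuation(vol):
    """Fit a line to the mean intensity per optical section and rescale
    each slice so that the fitted trend becomes flat.
    vol has shape (z, y, x). Simplified: the paper uses only the signal
    above a threshold and only the linear part of the curve."""
    z = np.arange(vol.shape[0])
    means = vol.mean(axis=(1, 2))       # mean signal per slice
    k, m = np.polyfit(z, means, 1)      # fitted line: k*z + m
    factors = m / (k * z + m)           # I_z,comp = I_z * m / (k*z + m)
    return vol * factors[:, None, None]

# Synthetic volume with a linear multiplicative decay of 2% per slice.
vol = np.full((20, 16, 16), 100.0) * (1 - 0.02 * np.arange(20))[:, None, None]
comp = compensate_attenuation(vol)
```

After compensation the mean intensity per slice is constant, as intended.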

The voxel size of the original data was 98 nm in the x and y directions and 163 nm in the z direction, and the image volumes had a size of 512 × 512 × 99 voxels. To obtain cubic voxels, the image was rescaled by a factor 98/163 in the x and y directions, resulting in an image volume of 307 × 307 × 99 voxels. Rescaling was performed by mapping each output voxel position linearly into the input image and picking the value of the nearest input voxel (no intensity interpolation was performed).
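A minimal resampling sketch using SciPy (the authors' own resampling code is not shown, so this is an assumption): `ndimage.zoom` with `order=0` interpolates positions but copies the nearest input value, matching the no-intensity-interpolation choice.

```python
import numpy as np
from scipy import ndimage as ndi

# Shrink x and y by the voxel-size ratio 98/163 while leaving z untouched.
# order=0 means each output voxel copies the nearest input voxel.
vol = np.random.default_rng(0).random((9, 64, 64))
cubic = ndi.zoom(vol, (1.0, 98 / 163, 98 / 163), order=0)
```

For the full 512 × 512 × 99 volume the same call yields the roughly 307 × 307 × 99 result quoted above.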

The spatial resolution in the depth direction was not as good as in the lateral directions. If the resolution in the depth direction is low, the gradient magnitude will be affected. This can cause problems in 3D segmentation using gradient information. Some of the effect may be reduced by using a direction-dependent gradient operator. No such compensation was needed in the image example presented here. After compensation for light attenuation and non-cubic voxels, the 3D images were filtered using a 3 × 3 × 3 Gaussian filter, just as in the 2D case.

2.2. Seeding

Seeds marking probable foreground and background regions are planted in the image. These seeds serve as starting points in the watershed algorithm applied to the gradient magnitude image.

The images used in this paper contain bright objects on a darker background. Hence, each object of interest contains at least one local intensity maximum. We define foreground seeds in the original image using the extended h-maxima transform (Soille, 1999). The extended h-maxima transform filters out the relevant maxima using a contrast criterion. All maxima whose heights are smaller than a given threshold level h are suppressed (Fig. 1). The extended h-maxima transformation can be implemented using sorted pixels and searching for local maxima with a given contrast compared with the local neighbourhood. The only parameter h is related to the height of the structures and no size or shape criterion is needed. A low h will result in many seeds, often more than one seed per object. A high h will result in fewer seeds, and some objects may not get a seed at all. Owing to a subsequent merging step based on gradient magnitude (described below), we use a rather low h value to ensure that each object gets at least one seed. The choice of h turns out not to be a critical operation, because a range of values yield satisfactory results, as long as each object gets at least one seed. All foreground seeds are uniquely labelled using connected component labelling.
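The extended h-maxima transform can be sketched via iterative grayscale reconstruction, a simple (if slow) alternative to the sorted-pixel implementation mentioned above. The toy image below is an assumption for illustration: two blobs whose internal peaks have heights 3 and 1 relative to their plateaus.

```python
import numpy as np
from scipy import ndimage as ndi

def reconstruct_by_dilation(seed, mask):
    """Grayscale morphological reconstruction: repeatedly dilate `seed`
    and clip it under `mask` until the result no longer changes."""
    cur = np.minimum(seed, mask)
    while True:
        nxt = np.minimum(ndi.grey_dilation(cur, size=(3, 3)), mask)
        if np.array_equal(nxt, cur):
            return cur
        cur = nxt

def extended_h_maxima(img, h):
    """Regional maxima of the h-maxima transform (integer-valued images):
    maxima shallower than h are suppressed, the rest become seeds."""
    hmax = reconstruct_by_dilation(img - h, img)
    rmax = hmax - reconstruct_by_dilation(hmax - 1, hmax)
    return rmax > 0

# Left blob: internal peak of height 3; right blob: internal peak of
# height 1. With h = 2 each blob yields one connected seed component.
img = np.array([[0, 0, 0, 0, 0, 0, 0],
                [0, 5, 5, 0, 3, 3, 0],
                [0, 5, 8, 0, 3, 4, 0],
                [0, 5, 5, 0, 3, 3, 0],
                [0, 0, 0, 0, 0, 0, 0]])
seeds = extended_h_maxima(img, h=2)
labels, n_seeds = ndi.label(seeds)   # connected component labelling
```

Note how the shallow peak in the right blob is merged with its surroundings rather than producing a second seed there.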

Figure 1.

h-maxima in one dimension. The local maxima in the 1D image, or intensity profile, are marked by stars, and represent the result of h-maxima transformation using h = 1. If h = 2 or h = 3, the result will be the dark or light grey regions, respectively. A low h will result in many small regions, whereas a larger h will result in fewer, but larger regions.

Just as the objects can be seeded by extended h-maxima in the original image, the background can be seeded by extended h-minima in the original image, i.e. local minima deeper than a certain depth h. This method of seeding the background was used in an earlier version of our method (Wählby & Bengtsson, 2003). Owing to generally higher background intensity close to fluorescent objects, this way of seeding hardly generates any background seeds at all close to the objects. We choose to define our new background seeds in the gradient magnitude image instead, because an uneven background in the original image will not be strongly apparent in the gradient magnitude image.

We calculate the gradient magnitude image, as described below, and define our background seeds as the extended h-minima in the gradient magnitude image. Because the interiors of cell nuclei will also be local minima in the gradient magnitude (Fig. 2B), we have to discard all connected extended h-minima components smaller than a certain size, s, to make sure that no object pixels/voxels are set as background seeds. This way of using the gradient magnitude image to seed the background generates background seeds evenly distributed in the image, even if an image has a non-uniform background. A foreground seed may overlap with a background seed. If this is the case, the overlap region is set to belong to the foreground seed. It will, however, most likely be merged with the background in the subsequent merging step as its borders will have a very weak gradient magnitude. The final foreground and background seeds can be seen in Fig. 2(C,D).
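The size filtering of candidate background seeds can be sketched with connected component labelling; the two components below are hypothetical stand-ins for a genuine background region and a dark nucleus interior.

```python
import numpy as np
from scipy import ndimage as ndi

def remove_small_components(mask, s):
    """Discard connected components of `mask` with fewer than s pixels,
    as done for the extended h-minima of the gradient magnitude."""
    labels, n = ndi.label(mask)
    sizes = np.bincount(labels.ravel())  # sizes[0] counts the background
    keep = sizes >= s
    keep[0] = False                      # never keep the zero label
    return keep[labels]

mask = np.zeros((10, 10), bool)
mask[0:6, 0:6] = True     # 36-pixel component: a true background region
mask[8:10, 8:10] = True   # 4-pixel component: a dark nucleus interior
bg_seeds = remove_small_components(mask, s=10)
```

Only the large component survives, so no object pixels end up as background seeds.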

The seeds are important for the final result of the segmentation. Non-seeded objects will never be found. More than one seed per object does not, however, necessarily lead to over-segmentation in the final result. It is therefore better to have too many than too few seeds. A more exact seeding will, by contrast, result in faster segmentation.

2.3. Calculating the gradient

The seeds of the objects and the seeds of the background should grow and meet where the gradient magnitude image has a local maximum. The magnitude of the gradient expresses the variation of local contrast in the image, i.e. sharp edges have a high gradient magnitude, whereas more uniform areas in the image have a gradient magnitude close to zero. The local maximum of the gradient magnitude marks the position of the strongest edge between object and background. There are many different approximations of the gradient magnitude of an image in mathematical morphology. A commonly used approximation of the gradient is the Beucher gradient (Meyer & Beucher, 1990), which is obtained by assigning to each pixel x the difference between the highest and the lowest pixels within a given neighbourhood of x. Using the Beucher gradient, structures inside the nuclei give rise to edges similar in strength to those of the outer edges of the nuclei. A better result is obtained using Sobel operators. The Sobel operators take the positions of the differences in the local neighbourhood into account when approximating the gradient. The Sobel operators are a set of linear filters for approximating the gradients in the x, y (and z) directions of the image. In two dimensions, the weights of the 3 × 3 operator for approximation of gradients in the y direction are

 1  2  1
 0  0  0
-1 -2 -1

The operator for approximation of gradients in the x direction looks the same, but is rotated 90°. For approximation of 3D gradients, a set of 3 × 3 × 3 Sobel operators is needed. The z-slices of the Sobel operator for approximation of gradients in the 3D y direction are

slice z - 1:     slice z:        slice z + 1:
 1  2  1          2  4  2          1  2  1
 0  0  0          0  0  0          0  0  0
-1 -2 -1         -2 -4 -2         -1 -2 -1

The operator is simply rotated for approximation of gradients in the x and z directions.


The gradient magnitude image is approximated by taking the sum of the absolute value of the convolution of the image with the different Sobel operators (Sonka et al., 1999). The result in two dimensions can be seen in Fig. 2(B).
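The sum-of-absolute-Sobel-responses approximation can be written dimension-independently; `scipy.ndimage.sobel` applies the separable Sobel weights along one axis at a time.

```python
import numpy as np
from scipy import ndimage as ndi

def gradient_magnitude(img):
    """Approximate the gradient magnitude as the sum of the absolute
    Sobel responses along each axis. Works unchanged for 2D and 3D."""
    return sum(np.abs(ndi.sobel(img.astype(float), axis=a))
               for a in range(img.ndim))

img = np.zeros((5, 5))
img[:, 2:] = 1.0           # vertical step edge between columns 1 and 2
gm = gradient_magnitude(img)
```

The response is concentrated on the two columns straddling the step and vanishes in the flat regions, which is exactly the behaviour the watershed relies on.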

2.4. Watershed

Watershed segmentation can be understood by interpreting the intensity image as a landscape. A hole is drilled in every minimum of the landscape, and the landscape is submerged in water. Water will then start to fill the minima, creating catchment basins. As the water rises, water from neighbouring catchment basins will meet. At every point where two catchment basins meet, a dam, or watershed, is built. These watersheds are the segmentation of the image. Watershed segmentation can be implemented with sorted pixel lists (Vincent & Soille, 1991). This implies that the segmentation can be performed very rapidly.

In the method described by Vincent & Soille (1991), pixels/voxels that are located at an equal distance from two catchment basins become part of the watershed lines. This means that we sometimes get thick watershed lines, leading to pixels/voxels that are not part of any catchment basin. In our implementation of the watershed algorithm, we keep track of the pixels/voxels that are ambiguous, i.e. located at an equal distance from two or more catchment basins, and let water flow around them. As a last step of the watershed, the most common neighbouring label is assigned to the ambiguous pixels/voxels, and every pixel/voxel is thereby made part of a catchment basin. This is necessary for the subsequent merging step described below.

In our seeded version of the watershed segmentation, water will rise from pixels marked as seeds, as well as from non-seeded regional minima found in the image. Separating dams, or watersheds, are built only between catchment basins associated with different seeds. As soon as the water level of a seeded catchment basin reaches the weakest point of the border towards a non-seeded regional minimum, it will be flooded. The water will continue to rise until each seeded catchment basin in the gradient magnitude image meets another seeded catchment basin.

2.5. Merging regions with weak borders

If too many seeds are created in the seeding step, some objects will have more than one seed. These objects will be over-segmented after the watershed algorithm, because each seed results in one region. However, if two seeds are in the same object, the magnitude of the gradient at the region boundaries will usually be low. Associating region boundaries with border strength requires some careful definitions. The strength of a border separating two regions should be calculated in such a way that the strength of the border between regions A and B is the same as the strength of the border between B and A. This is achieved by traversing the image of the segmented catchment basins once. If the current pixel/voxel has a label that is different from that of a ‘forward’ neighbour (two edge and two point neighbours in the 2D case, and three face, six edge and four point neighbours in the 3D case), the pixel/voxel intensities from the corresponding two positions in the gradient magnitude image are retrieved. The brighter of the two is chosen to represent the border strength between the two neighbouring pixels/voxels and saved in a table for border data. We choose the brightest value because it represents the strongest border value. If a pixel/voxel has several forward neighbours with different labels, one value will be added to the table of border data for each label.

The strength of the complete border between two regions can be measured in many different ways. Previously, the length of the border between two objects has been used to decide if neighbouring objects should be merged or not (Umesh Adiga & Chaudhuri, 2001), but then the gradient magnitude is not taken into consideration. A simple measure is to define the strength of a border as the weakest point along the border. This is often used for reducing over-segmentation resulting from watershed segmentation. However, many correctly segmented objects are then merged, owing to single weak border pixels or weak border parts originating from locally less steep gradients. Another simple measure, which is less sensitive to noise and local variations, is the mean value of all pixels along the border, i.e. the mean value of all maxima of the border pairs between two objects.

The mean value of the border of each merged object must be updated after merging. This is done by adding the information of border pixel sum and border length from the merged object to the new, larger, object and its neighbours. Instead of defining the border strength as a mean value, one might consider the median or some other percentile. This would, however, complicate the updating after merging, making the merging step much slower. The merging is continued until each remaining object border is stronger than a threshold te.
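The single-pass border scan and mean-strength computation described above can be sketched as follows; the merging loop itself and the threshold te are omitted, and the toy label/gradient images are assumptions for illustration.

```python
import numpy as np

def mean_border_strengths(labels, gm):
    """Scan the label image once; for each 'forward' neighbour pair with
    different labels, record the brighter of the two gradient values,
    then return the mean strength for each pair of adjacent regions."""
    fwd = ((0, 1), (1, 0), (1, 1), (1, -1))  # 2 edge + 2 point neighbours
    sums, counts = {}, {}
    H, W = labels.shape
    for y in range(H):
        for x in range(W):
            for dy, dx in fwd:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W and labels[ny, nx] != labels[y, x]:
                    pair = tuple(sorted((labels[y, x], labels[ny, nx])))
                    s = max(gm[y, x], gm[ny, nx])  # brightest of the two
                    sums[pair] = sums.get(pair, 0.0) + s
                    counts[pair] = counts.get(pair, 0) + 1
    return {p: sums[p] / counts[p] for p in sums}

labels = np.repeat([[1, 1, 2, 2]], 4, axis=0)  # two regions side by side
gm = np.zeros((4, 4))
gm[:, 1:3] = 2.0                               # weak edge along the border
strengths = mean_border_strengths(labels, gm)
# A pair whose mean strength falls below a threshold te would be merged.
```

Because each pair is stored under a sorted key, the strength of the border between A and B is by construction the same as between B and A.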

A strong border means that the object is well focused. When merging based on border strength is performed, not only are over-segmented objects merged, but also poorly focused objects are merged with the background and disappear. This may be of use if well-focused objects are important in the further analysis of fluorescent signals (e.g. Wählby et al., 2001). In this case, a rather high threshold te is suitable.

2.6. Separating clusters using shape

Tightly clustered nuclei will most likely not show a strong border where they touch, and they will thus not be properly separated by watershed segmentation. We do, however, as a result of the previous steps, obtain a correct segmentation of the clusters from the background. Cell nuclei are usually convex, and can be separated from each other on the basis of shape, as previously described in, for example, Vincent (1993) and Roysam et al. (1994), where binary images are used as input. We use the seeded and edge merged watershed result as binary input to a distance transformation, and thereafter apply watershed segmentation on the distance image. The distance transform of a binary image assigns to each object pixel the distance to the closest background pixel. In order to maintain the borders between already separated nuclei, the eight-connected outer edge of each object is removed prior to distance transformation, so that background pixels will be present between all different objects. We have used the 3–4 distance transform for 2D images (Borgefors, 1986), and the 3–4−5 distance transform for 3D images (Borgefors, 1996) (see Fig. 2G). Taking the inverse of the distance image, the distance maxima serve as regional minima for watershed segmentation. Catchment basins will be built around every distance maximum, as seen in Fig. 2(H). As the nuclei are fairly round, the most prominent distance maxima will coincide with the centres of nuclei, and catchment basin borders will coincide with the narrower waists of the clusters. The discrete distance image may, however, contain too many local maxima, resulting in over-segmentation. Merging based on the weakest border pixel is applied to reduce this over-segmentation. Only those borders whose minimum strength is greater than ts are kept, corresponding to a maximal allowed shape variation of the object.
Over-segmentation may also be reduced by removing small shape variations in a preprocessing step, such as morphological opening of the distance image, or by using extended h-maxima as seeds for the watershed. The resulting regions would, however, probably be similar, and region-based merging is usually faster than using morphological filters. The result can be seen in Fig. 2(I).
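Shape-based separation can be sketched with a Euclidean distance transform followed by a seeded watershed on the inverted distance image (the paper uses chamfer 3–4 distances and finds the maxima automatically; here the Euclidean transform is used and the two seeds are hand-picked, both assumptions for illustration).

```python
import numpy as np
from scipy import ndimage as ndi

# Binary mask: two 5 x 5 squares joined by a one-pixel bridge (narrow waist).
mask = np.zeros((7, 13), bool)
mask[1:6, 1:6] = True
mask[1:6, 7:12] = True
mask[3, 6] = True

dist = ndi.distance_transform_edt(mask)  # peaks at the two square centres

# Invert the distance image so its maxima become minima, then flood from
# seeds placed at the two most prominent distance maxima. The merging of
# shallow basins controlled by ts is omitted from this sketch.
inv = (dist.max() - dist).astype(np.uint8)
markers = np.zeros(mask.shape, np.int16)
markers[3, 3] = 1
markers[3, 9] = 2
markers[~mask] = 3                        # background marker
split = ndi.watershed_ift(inv, markers)
```

The flooding splits the dumbbell at its waist, assigning each square to its own label, which is the behaviour used to divide clustered nuclei.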

3. Experiments

3.1. Specimen preparation and imaging

The segmentation method was tested on 2D and 3D images of tissue sections from samples of routinely fixed and paraffin-embedded cervical carcinoma (3D), and carcinoma of the prostate (2D). The tissue sections for 2D analysis were cut at a thickness of 2–4 µm, and the tissue section for 3D analysis was cut at 30 µm. The sections were put on Superfrost Plus microscope slides, incubated at 58 °C overnight, and then stepwise re-hydrated using xylene and graded alcohols. The slides were briefly washed in a washing buffer (0.05 M Tris-HCl, pH 7.6, 0.3 M NaCl, and 0.02% Tween 20) before and after staining.

For 2D imaging, DNA staining was performed by incubating the tissue sections with 10 µM DAPI solution for 5 min. The sections were briefly washed in buffer and mounted in DABCO mounting medium. Images were acquired using an epifluorescent microscope from Delta Vision (Applied Precision, Seattle, WA, U.S.A.) equipped with a cooled monochrome CCD camera (Photometrics, Tucson, AZ, U.S.A.). A Zeiss Plan-Neofluar 63 × NA 1.30 lens (Carl Zeiss GmbH, Oberkochen, Germany) was used, resulting in a pixel size of 106 × 106 nm and an optical resolution of approximately 200 nm in the x and y directions. The Delta Vision system has an image deconvolution option; however, this was not used because a good deconvolution result requires a large number of optical sections. The images that were used are thus equivalent to images produced by a standard fluorescence microscope.

For 3D imaging, DNA staining was performed using a solution containing approximately 0.3 µg mL⁻¹ propidium iodide dissolved in washing buffer. The stained slide was mounted in DABCO mounting medium and imaged using a laser scanning confocal microscope from Leica Microsystems (Heidelberg, Germany). An HCX PL APO 40× NA 1.25 oil UV lens was used, resulting in a pixel size of 98 nm in the x and y directions and 163 nm in the z direction. The 568-nm laser line was used for excitation and all emission longer than 587 nm was collected. No deconvolution was applied.

3.2. Implementation and testing

An Alpha personal workstation 433 running hp Tru64 UNIX was used for development and testing of the software. The implementation was done in C++ and the code was part of IMP, an image processing platform developed at the Centre for Image Analysis (Nordin, 1997). Once the five input parameters hfg, hbg, s, te and ts were set, the experiments needed no human interaction. The speed of the segmentation depends on image size and the number of objects in the image. The full preprocessing and segmentation of a 2D image with 1024 × 1024 pixels containing 130 cells takes less than 2 min. 3D segmentation is more time consuming but, on the other hand, manual marking of 3D objects is not really an option, because there is no natural way of marking 3D objects on a 2D screen.

4. Results

4.1. 2D results

The algorithm was tested on six different 2D images, each containing 87–147 cells, 689 cells in total. Input parameters were – foreground seeding, extended h-maxima: hfg = 3; background seeding, extended h-minima: hbg = 6 and size limit: s = 5000; edge merging: edge strength threshold te = 10; shape-based cluster separation: merging threshold ts = 5. All in all, only five input parameters are needed for the full segmentation task; they were set using a single test image and then used unchanged for all images. The results are given in Table 1. Depending on the border strength threshold te, cells that are poorly focused may be lost. Over-segmented cells in the table refer to cell nuclei that have been divided into more than one object, plus any extra objects found in the image. By extra objects, we mean objects that were not marked as cell nuclei by the visual inspection; they are usually poorly focused cell nuclei or debris. The numbers in parentheses represent over-segmented nuclei only. Under-segmented cells refer to clusters of nuclei that have not been properly separated, plus nuclei that were not detected at all. The numbers in parentheses represent under-segmented clustered nuclei only. The segmentation result achieved by the described method was compared with manual counts from the same image fields. The small variation in the results across images indicates that the method is robust.

Table 1. Segmentation results in 2D images (in number of nuclei) after each of the segmentation steps in Section 2. Each step lists over-segmented / under-segmented counts (see text for the meaning of the numbers in parentheses).

| Image | Actual no. of nuclei | Intensity: seeded watershed (over / under) | Edge: merging weak borders (over / under) | Shape: cluster separation (over / under) |
|---|---|---|---|---|
| 1 | 117 | 36 (12) / 5 (4) | 0 (0) / 9 (7) | 1 (1) / 4 (2) |
| 2 | 109 | 40 (23) / 11 (10) | 1 (0) / 22 (18) | 1 (0) / 14 (10) |
| 3 | 133 | 44 (19) / 3 (2) | 1 (0) / 18 (9) | 4 (1) / 14 (5) |
| 4 | 147 | 72 (27) / 4 (2) | 2 (1) / 10 (4) | 2 (1) / 6 (0) |
| 5 | 87 | 35 (7) / 2 (2) | 1 (0) / 15 (9) | 1 (0) / 9 (3) |
| 6 | 96 | 33 (9) / 9 (6) | 0 (0) / 11 (6) | 0 (0) / 9 (4) |
| Total | 689 | 260 (97) / 34 (26) | 5 (1) / 85 (53) | 9 (3) / 56 (24) |
| Correct | | 57% (82%) | 87% (92%) | 91% (96%) |

4.2. 3D results

The algorithm was also tested on a volume image, containing 90 cells in total. The result can be seen in Fig. 3. Input parameters were – foreground seeding, extended h-maxima: hfg = 6; background seeding, extended h-minima: hbg = 6 and size limit: s = 40 000; edge merging: edge strength threshold te = 15; shape-based cluster separation: merging threshold ts = 3. Visual inspection of the 3D result was performed by observing each slice of the result and comparing it with the original image. Due to extensive seeding, almost every nucleus was over-segmented after the first segmentation step (seeded watershed). Out of the 90 cell nuclei, 82 were correctly segmented, four were over-segmented and four were under-segmented, resulting in 91% correct segmentation. No nuclei within the tissue slice were missed, but some thin pieces of nuclei near the edge of the tissue slice were lost. In addition to the 90 cells, 18 fluorescing objects not resembling proper cell nuclei were also found. Most of them can, if needed, be discarded on the basis of their small size.

Figure 3.

(A) Maximum intensity projection of the 99 z-slices of a 3D image of a cervical carcinoma tumour. (B) Surface rendering (using marching cubes) of the final 3D result. Objects on the border of the image have been removed for better visualization. They are, however, included in the results. A densely packed cell layer is clearly visible. Note that regions too small to possibly be true cell nuclei can be removed easily by further processing. (C,D) Close-ups showing cases where our method separates clustered cell nuclei.

5. Discussion

The method described in this paper combines intensity information with edge strength and shape. Very little preprocessing is needed, even if the background variation in the image is large. Poorly focused objects in 2D images are automatically removed, as their edge strength is low. The number of missing objects can be reduced by not allowing seeded foreground objects to merge with the background; this will, however, mean that the poorly focused objects are not removed. The input parameters are at present manually set for a test image, and the same parameters are thereafter used for fully automatic segmentation of images created under the same imaging conditions. As only five input parameters are required, this can be done quickly. Automatic parameter approximation may be possible: the optimal values for the extended h-maxima and h-minima and for the edge strength threshold all depend on the intensity dynamics of the image, whereas the size limit for the background seeds and the shape-based merging threshold depend directly on the size of the nuclei. Methods for automatic parameter approximation are subjects for future work. When a new type of specimen is imaged, adjustment of input parameters will only be necessary if the image dynamics or the nuclear size changes.

The segmentation method can be useful for many different segmentation tasks where a simple foreground/background threshold is not sufficient. Further processing, such as removal of nuclei that are damaged or under-segmented, by a size threshold, or more advanced statistical methods, may improve the result.
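On a labelled segmentation result, the size-threshold cleanup mentioned above reduces to counting pixels per label and clearing the small regions. A minimal numpy sketch (the function name is ours, chosen for illustration):

```python
import numpy as np

def remove_small_objects(labels, min_size):
    """Set labelled regions with fewer than min_size pixels to background (0).

    `labels` is an integer image where 0 is background and each object
    carries a distinct positive label, as produced by the watershed.
    """
    counts = np.bincount(labels.ravel())   # pixel count per label
    too_small = counts < min_size
    too_small[0] = False                   # never touch the background label
    out = labels.copy()
    out[too_small[labels]] = 0             # clear every pixel of a small region
    return out
```

The same one-liner logic applies unchanged to 3D label volumes, e.g. for discarding the small fluorescing objects reported in Section 4.2.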

When applied to 3D images, the main difficulty is to acquire input images that have a sufficient gradient in the z-direction. If the gradient in the z-direction is very weak, a gradient filter that takes this into consideration may improve the result.
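One simple way to account for anisotropic sampling when computing the gradient magnitude is to scale the finite differences by the physical voxel spacing, so that the coarser z step does not further penalise an already weak z-gradient. A minimal sketch using numpy's central differences (the default spacing below matches the 163 × 98 × 98 nm voxels of Section 3.1, expressed in micrometres; this is our illustration, not the paper's filter):

```python
import numpy as np

def gradient_magnitude_3d(vol, spacing=(0.163, 0.098, 0.098)):
    """Gradient magnitude of a 3D volume with anisotropic voxel spacing.

    `spacing` gives the physical step along (z, y, x); np.gradient divides
    each finite difference by the corresponding step, so all three
    components are expressed per unit physical length.
    """
    gz, gy, gx = np.gradient(vol.astype(float), *spacing)
    return np.sqrt(gz**2 + gy**2 + gx**2)
```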

Automatic segmentation is not only faster than manual segmentation, it is also observer-independent and reproducible. Manual correction of segmentation errors is a common complement to fully automatic segmentation, yielding 100% correct results in a short time and with little observer bias.


Human material was acquired under the approval of the Ethical Human Research Committee at the Karolinska Hospital, Sweden (approval no. 01-367). We thank Stefan Gunnarsson for providing the 3D images and Joakim Lindblad for valuable help and advice.