Dense nuclei segmentation based on graph cut and convexity–concavity analysis

Authors

  • J. QI

    Corresponding author
    1. Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, Illinois, U.S.A.
    2. Department of Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
    • Correspondence to: Jin Qi, Department of Electrical Engineering and Computer Science, Northwestern University 2145 Sheridan Road, Evanston, IL 60208, U.S.A. Tel: +01 773 742 1699; fax: +01 847 491 4455; e-mail: jqichina@hotmail.com; j-qi@northwestern.edu


Summary

With the rapid advancement of 3D confocal imaging technology, more and more 3D cellular images are becoming available. However, robust and automatic extraction of nuclei shape may be hindered by a highly cluttered environment, for example in fly eye tissues. In this paper, we present a novel and efficient nuclei segmentation algorithm based on the combination of graph cut and a convex shape assumption. The main characteristic of the algorithm is that it segments the nuclei foreground using a graph-cut algorithm with our proposed initialization method and splits overlapping or touching cell nuclei by simple convexity and concavity analysis. Experimental results show that the proposed algorithm can segment complicated nuclei clumps effectively in our fluorescent fruit fly eye images. Evaluation on a public hand-labelled 2D benchmark demonstrates substantial quantitative improvement over other methods. For example, the proposed method achieves a decrease of 3.2 in Hausdorff distance and a decrease of 1.8 in the merged nuclei error per slice.

Introduction

In recent years, many image analysis approaches have been adopted for cell and nuclei segmentation from microscopic images. For simple cases, basic techniques such as thresholding (Otsu, 1978), simple region growing and edge or boundary detection are most popular. These methods provide a good separation of cells or nuclei from the background when the quality of the input images is good and their contrast is high. Unfortunately, most microscopic images are noisy, the contrast between the foreground (cell/nucleus) and the background is rather small, and the variance within the foreground is rather large. Other promising methods for cell or nuclei foreground segmentation include seeded watershed algorithms (Pinidiyaarachchi & Wahlby, 2005; Cheng & Rajapakse, 2009), a supervised machine learning method (Kong et al., 2011), a level set active contour model (Harder et al., 2011), multiscale analysis (Gudla et al., 2008), dynamic programming-based methods (Baggett et al., 2005; McCullough et al., 2008), Markov random fields (Luck et al., 2005) and graph-cut methods (Boykov & Funka-Lea, 2006; Danek et al., 2009; Al-Kofahi et al., 2010). A short review of these methods follows.

Seeded watershed segmentation (Pinidiyaarachchi & Wahlby, 2005; Cheng & Rajapakse, 2009) uses seeds as markers to try to overcome the oversegmentation and undersegmentation problems occurring in the basic watershed algorithm (Vincent & Soille, 1991). In these methods, the watershed transformation is performed by defining a single seed or marker for each cell or nucleus. The drawback of the seeded watershed transformation is the difficulty of detecting seed points accurately. It often ends up with more than one seed per object, or with objects containing no seed, since images often contain noise and the image gradient is not strong. The errors arising from seed detection heavily impact the segmentation results.

A supervised machine learning method (Kong et al., 2011) segments the cell/nuclei regions from the other areas by classifying the image pixels into either the cell or the extra-cellular category, using colour-texture features extracted at each pixel by a local Fourier transform from a colour space. However, when nuclei/cells cluster heavily, the local features at the pixels lying on the interfaces between nuclei are so similar to the features at the pixels inside the nuclei that this method is not able to separate touching nuclei.

The level set active contour method (Harder et al., 2011) segments the foreground by computing image energy terms using the intensity variance inside and outside the contour. Although this region-based approach provides strong robustness to noise and allows segmentation of cells with blurred edges, it is incapable of separating touching nuclei, since the intensity variance inside a clump is similar to the intensity variance inside each nucleus that constitutes the clump. By analyzing the edges at different spatial scales, multiscale techniques can handle images with weak edges and nonuniform intensity variations, but they are unable to handle the edges of overlapping nuclei with minimal or no edge information (Gudla et al., 2008).

Dynamic programming-based methods (Baggett et al., 2005; McCullough et al., 2008) are semi-automatic methods for the detection of optimal boundaries, which are defined as the paths having the highest average intensity along their length compared to all other possible paths and are obtained using dynamic programming. These semi-automatic methods cannot be used to segment a large number of nuclei from a large stack of images because of the time burden caused by user interaction. Markov random field based nuclei segmentation methods (Luck et al., 2005) suffer from undersegmentation and high false positive rates.

Graph-cut methods (Boykov & Funka-Lea, 2006; Danek et al., 2009; Al-Kofahi et al., 2010) have received a lot of attention in recent years due to their robustness, reasonable computational demands and ability to integrate visual cues, contextual information and topological constraints, while offering several favourable characteristics such as global optima, unrestricted topological properties and applicability to N-D problems. The core of the solution relies on modelling the segmentation process as a labelling problem with an associated energy function. This function is then optimized by finding a minimal cut in a specially designed graph. The method can also be formulated in terms of the maximum a posteriori estimate of a Markov random field. Although the graph-cut method is powerful for segmenting the nuclei foreground from the background, it requires effective initialization and cannot separate touching cells or nuclei (Kong et al., 2011).

An extension of the graph-cut framework incorporating a ‘blob’-like shape prior (Lou et al., 2012) has been proposed to segment dense cell nuclei. This extension detects seed points by computing the stable local maxima across scales and calculates a Euclidean distance map from the seed centres. Then, by computing the gradient of the Euclidean distance map, it generates a vector field that is intended to originate from the inside of each nucleus and point towards its boundary. A so-called shape penalty is added to the graph-cut energy function, assigning a large penalty when the computed vector field is not compatible with the segmented contours. Although this shape prior based graph-cut method can segment and separate slightly touching nuclei, its performance depends heavily on the computed vector field. With high noise and overlapping nuclei, seeds cannot be detected robustly and the computed vector field cannot satisfy the requirement that it originates from the inside of each nucleus and points towards its boundary. The shape prior based graph-cut method (Lou et al., 2012) may therefore not produce satisfactory results for tight nuclei clumps. In this paper, we propose a new initialization method for the graph-cut model and a new nuclei cluster splitting method to overcome the disadvantages of graph cut in segmenting highly dense nuclei.

In general, the segmentation methods above have the potential to segment the foreground from the background, but they have limited success in separating clumped objects. Another widely used approach for dealing with clusters of touching cells or nuclei is based on the analysis of the concavities of clumps in a binary image, which is the result of segmenting background and foreground with any of the methods mentioned earlier. Concavity-based algorithms offer a simple and intuitive way of clump splitting. This approach first determines the dominant points on the contour, i.e. its concavities and convexities, based on which an optimal cut or split path is defined between concave points to minimize a cost function (Bai et al., 2009; Schmitt & Reetz, 2009; Yu et al., 2009; Indhumathi et al., 2011). A short review of these methods follows.

Morphological structural models of touching cells are used to split cell clusters in Yu et al. (2009). After using a simple thresholding method, i.e. Otsu's method (Otsu, 1978), to segment cell foreground from background, morphological models of touching cells are recognized, segmentation points (dominant concave points) are detected, the number and related arc data of touching cells are found and touching cells are reconstructed.

Morphological structural models (Yu et al., 2009) can separate a few cells which slightly touch each other since the structural model can only simulate simple clusters with few cells. If the complexity of touching is high, this method is not good enough to split the touching cells (Yu et al., 2009).

A concavity scale space based decomposition method (Schmitt & Reetz, 2009) has been proposed for cell cluster decomposition. This method also uses the simple thresholding method of Otsu (Otsu, 1978) to segment the input greyscale images and obtain binary images. It then analyzes the contour curvature on the scale space of Fourier coefficients and recognizes the relevant dominant concave points. Based on an optimized heuristic approach, pairs of these dominant points are recursively matched until the split objects no longer possess concavity intrusions. The advantage of this method is that holes within regions can be integrated into splitting paths. Compared to other methods, the concavity scale space based decomposition approach has certain advantages regarding the partitioning of clustered regions with emerging complexity. A disadvantage is the large number of parameters that need to be adjusted (more than 10 parameters) (Schmitt & Reetz, 2009). Tuning so many parameters for each new image is laborious.

The concave point and ellipse fitting cell splitting algorithm in Bai et al. (2009) also uses the simple thresholding method of Otsu (Otsu, 1978) to segment greyscale images and obtain binary images. The algorithm includes two parts: contour preprocessing and ellipse processing. The purpose of contour preprocessing is to smooth fluctuations of the contour, find concave points of the contour and use them to divide the contour into different segments. The purpose of ellipse processing is to process the different segments of the contour into possible single cells by using the properties of the fitted ellipses. Because concave points divide the whole contour of touching cells into different segments which have similar properties, the ellipse processing can separate the touching cells through ellipse fitting. However, in the case of large clusters of several dozens of cells, different segments of contours from different cells may have similar properties and the properties of the fitted ellipses may not be robust due to the short contour segments from small cells. Furthermore, the computation is time-consuming due to the recursive ellipse fitting process, especially in the case of high-density cell images. A large number of parameters need to be adjusted (seven parameters; Bai et al., 2009), making the process very elaborate.

The concavity based 3D cell cluster splitting method in Indhumathi et al. (2011) takes advantage of concavity analysis and interslice spatial coherence. This algorithm uses higher order statistics (Indhumathi et al., 2009) to segment the 3D cell foreground from the background and obtains the 3D boundary of cells and their clusters. Concave points on the boundary are detected using masks. The first slice is selected as a reference slice, and splitting paths between concave point pairs are found in this reference slice using some criteria. Splitting paths in the next slice are searched between concave points in the neighbourhood of the splitting paths from the reference slice. In other words, reference splitting paths constrain the search region in other slices. However, neighbouring slices may have a different number of splitting paths, since a different number of cells may appear in neighbouring slices. In such a case, this algorithm will not work well.

In a nutshell, current general nuclei segmentation methods contain two main steps: nuclei foreground segmentation and cluster splitting. For nuclei foreground segmentation, simple methods such as thresholding (Otsu, 1978) obtain undesirable results due to their noise sensitivity. Other advanced methods, such as the level set active contour model (Harder et al., 2011) and graph cut (Boykov & Funka-Lea, 2006; Danek et al., 2009; Al-Kofahi et al., 2010), are robust to noise. However, the level set active contour method (Harder et al., 2011) has numerical stability problems and is slow, since it is based on an iterative procedure. This is a major issue when segmenting a large number of image slices with high image resolution (e.g. 2048 × 2048 pixels). In contrast, the graph-cut method (Boykov & Funka-Lea, 2006; Danek et al., 2009; Al-Kofahi et al., 2010) is fast and its solution is robust, since it results in a globally optimal solution. Hence, in this paper the graph-cut method is chosen for nuclei foreground segmentation. Another reason for choosing this method is that its output does not need morphological post-processing. In dense cell populations, small background areas may appear as holes within nuclei clumps, and these holes are very useful when splitting clumps. Morphological operations applied to the segmented binary images would alter or remove such small holes and should therefore be avoided. However, a graph-cut algorithm requires effective initialization (Kong et al., 2011). Therefore, in this paper we propose a new initialization method based on the Poisson model of intensity of fluorescent microscopic images (Al-Kofahi et al., 2010).

For the clump splitting problem, the majority of current methods (Bai et al., 2009; Schmitt & Reetz, 2009; Yu et al., 2009; Indhumathi et al., 2011) are based on the analysis of concavities of clumps in the segmented binary images. All of them detect all possible dominant concave points in binary images at one time and then search all possible splitting paths between detected concave points based on some criteria. However, detecting all possible concave points is a hard task when the nuclei density is high, since the topology of a nuclei clump consisting of dozens of nuclei (such as the red clump in Fig. 2) is so complicated that concave point detection methods based on simple criteria fail to find the correct concave points. Furthermore, there are correlations among concave points from the same clump. In other words, when one splitting path associated with some concave points is used to split the nuclei clump into two parts, some other concave points in the two split parts may become nonconcave points. This means that the concave point list should be updated while the splitting process is going on. None of the current nuclei cluster splitting methods (Bai et al., 2009; Schmitt & Reetz, 2009; Yu et al., 2009; Indhumathi et al., 2011) can update the concave point lists along with the splitting process.

In general, some of the desirable attributes of algorithms for segmenting dense images are:

  1. The nucleus foreground segmentation method is robust to noise with no need to use morphological operations to further process the segmented binary images;
  2. Small holes within nuclei clumps can be integrated into clump splitting paths;
  3. It is not necessary to detect all possible concave points at one time and concave point list should be updated along with the splitting process;
  4. Splitting paths should not be constrained between concave points;
  5. The number of adjustable parameters should be as small as possible;
  6. Computation should be fast.

To address the above issues in this paper, we present a fully automatic method for dense nuclei segmentation. It is based on graph cut and uses the assumption of convex nuclei shape for the segmentation of touching nuclei.

Our method consists of two stages. In the first stage, nuclei (foreground) are segmented from the background using a graph-cut algorithm without considering touching nuclei. A new initialization method based on the Poisson model of intensity of fluorescent microscopic images (Al-Kofahi et al., 2010) is proposed to initialize the graph-cut model [specifically, the regional term $R(A)$ in Eq. (1)].

With the proposed initialization method, there is only one parameter to be tuned.

In the second stage, we propose a new simple and effective convexity–concavity analysis method (CCAM) for splitting the clustered nuclei obtained in the first stage. CCAM finds the convex hulls of nuclei clusters and splits each cluster into two parts if a distance criterion (detailed in Section ‘Splitting of touching nuclei’) is satisfied. The splitting process is repeated until the distance criterion is no longer met. Unlike other splitting methods, our proposed method does not have to detect all concave points, and there is no need to define concave points. Therefore, the proposed method does not constrain the splitting path to lie between concave points. Furthermore, small background areas appearing as holes within nuclei clumps can be naturally integrated into our splitting paths.

In CCAM, there is only one distance threshold that needs to be tuned. When splitting touching nuclei, we assume that the shape of each nucleus is almost convex (at most slightly concave). This convex shape assumption is more flexible and general than the elliptical nuclei shape model of Bai et al. (2009).

Our method is novel since:

  1. A new method based on the Poisson model of intensity of fluorescent microscopic images (Al-Kofahi et al., 2010) is proposed to initialize the graph-cut model. The number of parameters in the graph-cut model is reduced to one.

  2. We propose a new simple and efficient convexity–concavity analysis based nuclei clump splitting method and an optimal splitting path searching method. Our splitting method does not depend on the detection of all concave points and splitting paths are not constrained between concave points. Furthermore, holes within nuclei clumps can be integrated into splitting paths.

    Only one parameter needs to be tuned in the proposed nuclei splitting method.

  3. The graph-cut algorithm with our proposed initialization method has been combined with our proposed convexity–concavity analysis based splitting method for nuclei segmentation. The overall algorithm features two parameters: the balancing parameter λ in Eq. (1) of the graph-cut model and the distance threshold used by CCAM (Section ‘Splitting of touching nuclei’). These two parameters are independent of each other, so they can be tuned separately.

To the best of our knowledge, no similar prior work exists for the dense nuclei segmentation task.

The rest of the paper is organized as follows. Graph cut with our initialization method for nuclei foreground segmentation is presented in Section ‘Graph-cut model initialization for nuclei foreground segmentation’. A detailed description of our convexity–concavity analysis is given in Section ‘Splitting of touching nuclei’. Section ‘Experimental results’ describes segmentation results for drosophila eye nuclei in fluorescence microscopy images and demonstrates the improved performance of our algorithm compared to other methods on a public database. Finally, we conclude our work in Section ‘Conclusion’.

Graph-cut model initialization for nuclei foreground segmentation

The initialization of the graph-cut model is important when used for segmentation (Kong et al., 2011) and initialization methods for the graph-cut model are very different for different applications due to its high flexibility (Boykov & Funka-Lea, 2006). In this section, we provide the details of how to construct the energy function and how to initialize the graph-cut model based on Poisson modelling of the image intensity (Al-Kofahi et al., 2010) for our nuclei segmentation task.

The first step in nuclei segmentation is to separate the foreground and the background pixels. Therefore, we use graph cut with two terminals to segment the nuclei foreground. Consider an arbitrary set of data elements (pixels or voxels) $\mathcal{P}$ and some neighbourhood system represented by a set $\mathcal{N}$ of all (unordered) pairs $\{p, q\}$ of neighbouring elements in $\mathcal{P}$. Let $A = (A_1, \ldots, A_p, \ldots, A_{|\mathcal{P}|})$ be a binary vector whose components $A_p$ can be either ‘obj’ or ‘bkg’ (abbreviations of ‘object’ and ‘background’). Vector $A$ defines a segmentation. The graph-cut algorithm minimizes the following energy function (Boykov & Funka-Lea, 2006):

$$E(A) = \lambda \cdot R(A) + B(A) \qquad (1)$$

where

$$R(A) = \sum_{p \in \mathcal{P}} R_p(A_p) \qquad (2)$$
$$B(A) = \sum_{\{p, q\} \in \mathcal{N}} B_{p,q} \cdot \delta_{A_p \neq A_q} \qquad (3)$$

and

$$\delta_{A_p \neq A_q} = \begin{cases} 1 & \text{if } A_p \neq A_q \\ 0 & \text{if } A_p = A_q \end{cases}$$

The coefficient $\lambda \geq 0$ in (1) specifies the relative importance between the regional term $R(A)$ and the boundary term $B(A)$. The regional term $R(A)$ assumes that the individual penalties for assigning pixel p to ‘object’ and ‘background’, correspondingly $R_p(\text{obj})$ and $R_p(\text{bkg})$, are given. For example, $R_p(\cdot)$ may reflect how the intensity of pixel p fits into given intensity models (e.g. histograms, Gaussian model and Poisson model) of the object and background. $R_p(\text{obj})$ and $R_p(\text{bkg})$ can be defined by the following statistical models:

$$R_p(\text{obj}) = -\ln \Pr(I_p \mid \text{obj}) \qquad (4)$$
$$R_p(\text{bkg}) = -\ln \Pr(I_p \mid \text{bkg}) \qquad (5)$$

where $I_p$ represents the intensity at pixel p.

Choosing appropriate statistical intensity models of the object and background in fluorescent microscopic images, i.e. $\Pr(I_p \mid \text{obj})$ and $\Pr(I_p \mid \text{bkg})$, is the key step in initializing the regional term $R(A)$ in the graph-cut energy function $E(A)$. In this paper, we use the mixture of two Poisson distributions (Pal & Pal, 1991; Al-Kofahi et al., 2010) to model the pixel intensity distributions $\Pr(I_p \mid \text{obj})$ and $\Pr(I_p \mid \text{bkg})$ in fluorescence microscopic images. They are given by

$$\Pr(I_p \mid \text{obj}) = \frac{\mu_{\mathrm{obj}}^{I_p} \, e^{-\mu_{\mathrm{obj}}}}{I_p!} \qquad (6)$$

and

$$\Pr(I_p \mid \text{bkg}) = \frac{\mu_{\mathrm{bkg}}^{I_p} \, e^{-\mu_{\mathrm{bkg}}}}{I_p!} \qquad (7)$$

where $\mu_{\mathrm{obj}}$ and $\mu_{\mathrm{bkg}}$ are the so-called average number parameters of the object and background distributions, respectively. They can be estimated automatically by maximizing the total entropy of the possible partitioning of the input image using only its histogram information (Pal & Pal, 1991). We omit the details of the estimation of these parameters since they can be found in Pal & Pal (1991). Therefore, there is no parameter that needs to be tuned for initializing the $R(A)$ term in the graph-cut energy function $E(A)$.

Pal & Pal (1991) explain why the Poisson model is more appropriate than other models (such as the Gaussian model) for modelling the pixel intensity distribution in fluorescent microscopic images.
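The regional penalties under a Poisson intensity model can be sketched as follows. This is a minimal illustration, assuming the standard negative log-likelihood form of the penalties used by Boykov & Funka-Lea (2006); the function names and the two mean values are hypothetical, and in the paper the means are estimated from the image histogram rather than fixed by hand.

```python
import math

def poisson_log_pmf(k, mu):
    """Natural log of the Poisson probability of observing count k with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def regional_penalties(intensity, mu_obj, mu_bkg):
    """Return (penalty for 'obj', penalty for 'bkg') as negative log-likelihoods."""
    return (-poisson_log_pmf(intensity, mu_obj),
            -poisson_log_pmf(intensity, mu_bkg))

# Hypothetical Poisson means for object and background (assumed, not from the paper).
MU_OBJ, MU_BKG = 120.0, 20.0

for value in (15, 60, 130):
    r_obj, r_bkg = regional_penalties(value, MU_OBJ, MU_BKG)
    cheaper = "obj" if r_obj < r_bkg else "bkg"
    print(f"I_p={value:3d}  R_obj={r_obj:8.2f}  R_bkg={r_bkg:8.2f}  cheaper label: {cheaper}")
```

Working in log space with `math.lgamma` avoids overflow of the factorial for 8-bit or 12-bit intensities.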

The $B(A)$ term comprises the ‘boundary’ properties of the segmentation $A$. Often, it is sufficient to set the boundary penalties from a simple function like (Boykov & Funka-Lea, 2006)

$$B_{p,q} \propto \exp\left(-\frac{(I_p - I_q)^2}{2\delta^2}\right) \cdot \frac{1}{\mathrm{dist}(p, q)} \qquad (8)$$

This function provides a large penalty for discontinuities between pixels of similar intensities, i.e. $|I_p - I_q| < \delta$. However, if the pixel intensities are very different, i.e. $|I_p - I_q| > \delta$, then the penalty is small. Here we want to segment the nuclei foreground. The value of δ can be set to the standard deviation of the image intensity, so δ requires no tuning. In Section ‘Experimental results’, we show that the segmentation result is not sensitive to the value of the parameter λ. Therefore, the graph-cut algorithm can be easily initialized by setting the only parameter λ to an appropriate value (the value of math formula has been used in this paper when δ is set equal to the standard deviation of the image intensity).

Figure 1 shows the drosophila eye nuclei segmentation result with multiple nuclei clusters using the graph-cut method proposed here. The binary image with the nuclei foreground is shown on the right, and the boundary contour is shown on the left, overlaid on the original greyscale image. Many touching nuclei are easy to observe. The largest connected component is highlighted in red. For better illustration, Fig. 1 only shows a subimage of one slice. Figure 2 shows similar segmentation results for a whole slice. Again, the largest connected component is shown in red.

Evidently, the complicated topology of the large nuclei clump (in red in Fig. 2) makes it impossible to detect all possible prominent concave points or to model the morphological shape of clumps, as required by existing nuclei splitting methods (Bai et al., 2009; Schmitt & Reetz, 2009; Yu et al., 2009; Indhumathi et al., 2011). Furthermore, since a large number of holes appear in the clump in Fig. 2, the nonhole-based splitting methods (Bai et al., 2009; Yu et al., 2009; Indhumathi et al., 2011) would fail to split such clumps. The nuclei density in our microscopic images is clearly high, and this high density is a challenge for segmentation algorithms. To address this kind of nuclei clustering problem, a splitting algorithm is proposed in Section ‘Splitting of touching nuclei’.
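To make the interplay of the regional and boundary terms concrete, the sketch below evaluates a two-label energy on a tiny 1D signal and finds its minimizer by brute force. It assumes the Gaussian-weighted boundary penalty of Boykov & Funka-Lea (2006) with δ set to the standard deviation of the intensities, and Poisson negative log-likelihoods for the regional term; the signal, the Poisson means and λ = 1 are hypothetical. A real implementation would use a max-flow/min-cut solver instead of enumeration.

```python
import itertools
import math
import statistics

def boundary_penalty(i_p, i_q, delta, dist=1.0):
    """Close to 1 for similar intensities, close to 0 across strong edges."""
    return math.exp(-((i_p - i_q) ** 2) / (2.0 * delta ** 2)) / dist

def neg_log_poisson(k, mu):
    """Negative log of the Poisson probability of count k with mean mu."""
    return -(k * math.log(mu) - mu - math.lgamma(k + 1))

def energy(labels, intensities, lam, delta, mu_obj, mu_bkg):
    """lam * (regional term) + (boundary term) on a chain of neighbouring pixels."""
    regional = sum(neg_log_poisson(i, mu_obj if a else mu_bkg)
                   for a, i in zip(labels, intensities))
    boundary = sum(boundary_penalty(intensities[k], intensities[k + 1], delta)
                   for k in range(len(labels) - 1)
                   if labels[k] != labels[k + 1])  # only label changes are penalized
    return lam * regional + boundary

# Hypothetical 1D intensity profile crossing one bright nucleus.
signal = [18, 22, 25, 110, 125, 118, 21, 19]
delta = statistics.pstdev(signal)  # delta tied to the intensity standard deviation

# Brute force over all 2^8 labellings (1 = object, 0 = background).
best = min(itertools.product([0, 1], repeat=len(signal)),
           key=lambda a: energy(a, signal, lam=1.0, delta=delta,
                                mu_obj=120.0, mu_bkg=20.0))
print(best)
```

Enumeration is only feasible for a handful of pixels; it is used here purely so that the minimizer can be checked by hand.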

Figure 1.

Nuclei foreground segmentation using graph cut with multiple nuclei clusters: left: boundary contour of the segmented image overlaid on the original greyscale image; right: corresponding binary image, with the largest connected component highlighted in red.

Figure 2.

Nuclei foreground segmentation using graph cut with many clusters: left: original high nuclei density greyscale image; right: binary image segmented by graph cut; the largest connected component is shown in red colour.

Splitting of touching nuclei

We need to split the touching nuclei obtained with the graph-cut algorithm. For this purpose, a new simple CCAM is proposed here under the assumption that each nucleus has a convex shape.

Let R be a simply connected region bounded by a rectifiable simple closed curve C (Fig. 3). The convex hull of R, denoted by $H(R)$, is simply connected and its boundary consists of a finite sequence of curves $C_1, C_2, \ldots, C_n$. Each $C_i$ is either an arc or a chord of C. Each chord, denoted by $L_j$, is a line segment of support of C. $S_j$ is the curve with the same end points as $L_j$ that belongs to the boundary curve C. The point on curve $S_j$ with the longest distance $d_j$ to the corresponding chord $L_j$ is denoted by $P_j$. The point $P_s$ is called the steepest concave point (SCP) if

$$d_s = \max_j d_j \qquad (9)$$

In Fig. 3, the SCP is $P_2$ and the subscript index of the SCP is $s = 2$, i.e. $P_s = P_2$. A point $P_{\mathrm{NBP}}$ among all points on the boundary curves of R is called the nearest boundary point (NBP) if

$$d(P_s, P_{\mathrm{NBP}}) = \min_{P} d(P_s, P) \qquad (10)$$

where $d(P_s, P)$ is the distance between the SCP $P_s$ and a boundary point P lying on the boundary curves of R. It should be noted that point P in Eq. (10) must not lie on the boundary curve $S_s$ on which the SCP $P_s$ is located. The line connecting the SCP and the NBP is an optimal splitting path, shown in blue in Fig. 3. If the largest distance $d_s$ (the distance from point $P_s$ to chord $L_s$) corresponding to the steepest point $P_s$ is less than a distance threshold $d_T$ (in this paper we used math formula for our dense nuclei images with image size 1940 × 1940, and math formula for the CMU U2OS data set with relatively large cells and image size 1349 × 1030), then the task of splitting connected component R is completed. Otherwise, the connected component R is split into two parts, R1 and R2, along the optimal separating path. For each split part, the above separating process is repeated until the distance $d_s$ is less than the threshold $d_T$.
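As a worked illustration of the distance that drives the SCP criterion, the perpendicular distance from a contour point to a hull chord can be computed with the cross-product formula below. The symbols a, b and p and the coordinates are hypothetical and are not taken from the paper.

```latex
% Assumed notation: chord end points a and b, contour point p (illustrative only).
\[
d(p; a, b) = \frac{\lvert (b_x - a_x)(p_y - a_y) - (b_y - a_y)(p_x - a_x) \rvert}
                  {\sqrt{(b_x - a_x)^2 + (b_y - a_y)^2}}
\]
% Example: a = (0, 0), b = (10, 0), p = (5, 1.5)
\[
d = \frac{\lvert 10 \cdot 1.5 - 0 \cdot 5 \rvert}{\sqrt{10^2 + 0^2}} = \frac{15}{10} = 1.5
\]
```

With a distance threshold of 1 pixel this concavity would trigger a split, whereas with a threshold of 2 pixels the component would be left intact.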

Figure 3.

A connected component R and its convex hull. math formula and C6 are arcs of the boundary contour C enclosing R, whereas math formula are chords of C. Denote math formula with math formula respectively. math formula are boundary curves with the same ending points as chords math formula, respectively. math formula are the points on the boundary curves math formula which have the largest distance math formula from all points on math formula to chords math formula, respectively. P2 is called SCP. Optimal splitting path is the line segment in blue colour connecting the steepest concave point (SCP) and the nearest boundary point (NBP).

Our touching nuclei splitting algorithm based on our convexity–concavity analysis is as follows:

Algorithm 1: Nuclei Segmentation

(1) Use graph cut with our new initialization method to segment nuclei foreground and obtain a binary image.
(2) Collect all connected components from segmented foreground into list L.
(3) For each connected component R in list L, find the steepest concave point (SCP) according to Eq. (9).
(4) If the corresponding largest distance is less than the distance threshold, delete this connected component R from list L and return to step 3. Otherwise, go to step 5.
(5) Find the corresponding nearest boundary point (NBP) according to Eq. (10).
(6) Split connected component R into two parts R1 and R2 along the optimal separating line segment connecting the SCP and the NBP. Add the split parts R1 and R2 back to the list L.
(7) Repeat steps 3 to 6 until list L is empty.
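The splitting loop of Algorithm 1 (steps 2 to 7) can be sketched on polygonal contours as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: each component is a simple polygon given as a vertex list, the NBP is searched only among contour vertices, holes are ignored, and the split path is not checked for staying inside the region. All function names and the dumbbell-shaped test contour are hypothetical.

```python
import math

def cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_indices(pts):
    """Sorted indices of convex-hull vertices (Andrew's monotone chain)."""
    order = sorted(range(len(pts)), key=lambda i: pts[i])
    def half(seq):
        chain = []
        for i in seq:
            while len(chain) >= 2 and cross(pts[chain[-2]], pts[chain[-1]], pts[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain[:-1]
    return sorted(set(half(order) + half(order[::-1])))

def steepest_concave_point(contour):
    """Return (index, depth, (start, end)) of the vertex farthest from its hull chord."""
    hull, n = hull_indices(contour), len(contour)
    best = (None, 0.0, None)
    for k, start in enumerate(hull):
        end = hull[(k + 1) % len(hull)]
        a, b = contour[start], contour[end]
        i = (start + 1) % n
        while i != end:  # vertices strictly between two consecutive hull vertices
            depth = abs(cross(a, b, contour[i])) / math.dist(a, b)
            if depth > best[1]:
                best = (i, depth, (start, end))
            i = (i + 1) % n
    return best

def nearest_boundary_point(contour, scp, chord):
    """Nearest vertex that is not on the concave curve containing the SCP."""
    start, end = chord
    n = len(contour)
    excluded = {(start + k) % n for k in range((end - start) % n + 1)}
    candidates = [j for j in range(n) if j not in excluded]
    if not candidates:
        return None
    return min(candidates, key=lambda j: math.dist(contour[scp], contour[j]))

def split_touching(contour, threshold):
    """Keep splitting along SCP-NBP segments until every part is nearly convex."""
    pending, finished = [contour], []
    while pending:
        region = pending.pop()
        scp, depth, chord = steepest_concave_point(region)
        nbp = None
        if scp is not None and depth >= threshold:
            nbp = nearest_boundary_point(region, scp, chord)
        if nbp is None:
            finished.append(region)  # below threshold, or no valid split found
            continue
        lo, hi = sorted((scp, nbp))
        pending.append(region[lo:hi + 1])              # one side of the split path
        pending.append(region[hi:] + region[:lo + 1])  # the other side
    return finished

# Hypothetical dumbbell: two blobs joined by a neck, notches at (5, 1.5) and (5, 3).
dumbbell = [(0, 0), (4, 0), (5, 1.5), (6, 0), (10, 0),
            (10, 4), (6, 4), (5, 3), (4, 4), (0, 4)]
parts = split_touching(dumbbell, threshold=1.0)
print(len(parts))
```

Because the concavity depths are recomputed for every part after each split, the list of candidate concavities is implicitly updated as the splitting proceeds.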

The effect of using this convexity and concavity analysis based splitting method is illustrated in Fig. 4 by splitting the largest connected component from Fig. 1. In our splitting process, the first four optimal splitting paths (line segments in red colour) found successively by our splitting method are shown in Figs. 4(B–D), where the convex hulls of connected components are shown with green polygons. The final split result is in Fig. 4(E). Figure 4(F) shows the contours of the split image in (E) overlaid over the original greyscale image. Figure 4 demonstrates that our convexity and concavity analysis based method can separate tight nuclei clustering accurately.

Figure 4.

Illustrating key steps of the proposed nuclei splitting method. Convex hulls of the connected components are shown in green polygons. (A) The largest connected component from Fig. 1. (B) The first optimal splitting path (highlighted by red colour) found by the first iteration in our splitting method. (C) The second optimal splitting path (highlighted by red colour) found by the second iteration in our splitting method. (D) The third and fourth optimal splitting paths (highlighted by red colour) found by the third iteration in our splitting method. (E) Final split image by our method. (F) Original greyscale connected component image overlaid by contours (in green colour) of split image in (E).

Compared to other concavity-based nuclei cluster splitting methods (Bai et al., 2009; Schmitt & Reetz, 2009; Yu et al., 2009; Indhumathi et al., 2011), our convexity–concavity based nuclei cluster splitting method proposed here features

  1. Integration of small holes within nuclei clumps into cluster splitting paths;
  2. No need to define and detect all concave points which are hard to detect accurately in case of large nuclei clumps with complicated shape topology (as the red cluster shown in Fig. 2);
  3. Flexible splitting paths which are not constrained between concave points;
  4. Only one distance parameter;
  5. Fast computation.

Experimental results

We applied our approach to fluorescence microscopic images of the developing drosophila eye. The stack consists of 78 slices, each of 1940 × 1940 pixels. The images cover only half of the developing eye and were taken with a 40× lens on a confocal microscope. Cell nuclei are labelled in red, and the protein of interest, which is called YAN, is labelled in green; yellow is where the two colours coexist. Our aim is to measure the abundance of the protein of interest (green channel). The idea is to use the red channel as a map that unambiguously identifies the nuclei positions, and then to calculate the fluorescence intensity of the green channel inside the nuclei. For this purpose, we first need to segment the nuclei.
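
The measurement described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in this paper: it assumes that the red-channel segmentation is already available as a label image (0 for background, a positive integer per nucleus) and that "abundance" is summarized by the mean green intensity inside each nucleus. The function name and the toy arrays are hypothetical.

```python
from collections import defaultdict


def mean_intensity_per_label(labels, channel):
    """Mean value of `channel` inside every labelled region.

    labels  : 2D list of ints, 0 = background, k > 0 = nucleus k
    channel : 2D list of numbers with the same shape (e.g. the green channel)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label_row, value_row in zip(labels, channel):
        for label, value in zip(label_row, value_row):
            if label != 0:  # skip background pixels
                sums[label] += value
                counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


# Toy example: two nuclei on a 2 x 3 image.
nuclei = [[1, 1, 0],
          [0, 2, 2]]
green = [[10, 20, 99],
         [99, 30, 50]]
per_nucleus = mean_intensity_per_label(nuclei, green)
```

Background pixels (label 0) are ignored, so bright green signal outside the segmented nuclei does not affect the per-nucleus values.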

In Figs. 5–9, the segmentation results for the third whole slice (s3) and the ninth whole slice (s9) are shown to illustrate how the proposed segmentation algorithm deals with high-density nuclei images. Original colour images and red channel images of s3 and s9 are shown in Figs. 5(A–D), respectively. The segmented results and split results are shown in Figs. 6(A–D), respectively. Figures 7 and 8 show contours (blue curves) and centres (red dots) of segmented nuclei overlaid on the original red channel images of s3 and s9, respectively. For clarity, zoomed-in parts of Figs. 7 and 8 are shown in Figs. 9(A and B). These figures show that our proposed method can separate tightly touching nuclei within high-density nuclei images. Some undersegmented cases can be found in Figs. 7 and 8. They arise when the shape of the nuclei clump is already convex and the edges within the clump are rather weak (such as the elongated clump, probably consisting of three nuclei, at the bottom left of Fig. 9A). It is hard to split this kind of clump successfully without considering 3D information (information from upper or lower slices) or information on normal nuclei size. We plan to extend the proposed method by incorporating 3D information and the expected nuclei size in future work.

Figure 5.

Sample slices s3 (A,C) and s9 (B,D) with high nuclei density, (A,B) original colour and (C,D) red channel images.

Figure 6.

Sample segmentation results of two whole slices (s3 and s9) with high nuclei density, (A,B) segmented; and (C,D) split images.

Figure 7.

Sample segmentation result of whole slice (s3) with high density of nuclei. Contours (green curves) and centres (red dots) of segmented nuclei overlaying on original red channel image of s3.

Figure 8.

Sample segmentation result of whole slice (s9) with high density of nuclei. Contours (green curves) and centres (red dots) of segmented nuclei overlaying on original red channel image of s9.

Figure 9.

Sample segmentation results of two whole slices (s3 and s9) with high density of nuclei. (A,B) zoomed in parts of Figs. 7 and 8, respectively.

To illustrate the effectiveness of our algorithm more clearly, we show the segmentation results of subimages cropped from the same central area of original slices with slice number math formula. In Fig. 10, the binary images obtained by graph cut before splitting are shown on the left, the split binary images are shown in the middle, and the boundary contours of the split binary images are shown overlaid on the original greyscale images on the right.

Figure 10.

Some examples of segmentation using the proposed method: left column: binary images segmented by graph cut before splitting; middle column: split binary images; right column: contours overlaid on the original greyscale images.

The results in Fig. 10 demonstrate that the proposed algorithm can successfully segment nuclei from drosophila eye fluorescence microscopic images and split the ones clustering together.

Since we do not yet have digital ground truth for our microscopic fruit fly eye images, we evaluate the performance of the proposed algorithm on a public benchmark data set, the 2D nuclei segmentation benchmark from the paper (Coelho et al., 2009). We use the same measures as that paper: merge, split, spurious and missing for qualitative evaluation, and measures such as the Rand index and the Hausdorff distance for quantitative evaluation. Figure 11 shows our segmented result on the ‘difficult’ image (U2OS data set) suggested in the paper (Coelho et al., 2009). Figures 11(A and B) are the original greyscale image and the ground truth, respectively. Our segmented and split result is shown in Fig. 11(C). Figure 11(D) compares the nuclei contours drawn by hand (green contours) with those found by our algorithm (red contours). The red contours (automatic method) are largely consistent with the green contours (by hand), which shows that our algorithm deals well with the difficult segmentation problem arising from many clustered nuclei.

Figure 11.

Comparison of our segmentation result with hand segmentation on the ‘difficult’ example given in the paper Coelho et al. (2009). (A) Original greyscale example; (B) Ground truth (red curves); (C) Segmented result with our algorithm; (D) Green contours (gold standard) and red contours (by our algorithm) of nuclei.
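
For readers unfamiliar with the two quantitative measures named in this evaluation, the sketch below gives their textbook definitions. It is an illustration only and may differ in detail from the implementation used by Coelho et al. (2009), for example in how contours are sampled or how per-nucleus values are aggregated. Label images are assumed to be flattened into sequences, contours to be lists of (x, y) points, and the function names are hypothetical.

```python
import math
from collections import Counter


def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two labelings agree.

    A pair agrees when both labelings put the two pixels in the same
    region, or both put them in different regions.
    """
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of equal length >= 2")
    total = math.comb(n, 2)
    same_both = sum(math.comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(math.comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(math.comb(c, 2) for c in Counter(labels_b).values())
    # pairs joined in both + pairs separated in both
    agreements = total - same_a - same_b + 2 * same_both
    return agreements / total


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(points_a, points_b), directed(points_b, points_a))


# Toy inputs: 4 pixels labelled two ways, and two short contours.
ri = rand_index([0, 0, 1, 1], [0, 0, 1, 2])
hd = hausdorff_distance([(0, 0), (1, 0)], [(0, 0), (4, 0)])
```

A higher Rand index means better agreement with the hand labelling, whereas a lower Hausdorff distance means the automatic contour stays closer to the reference contour everywhere.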

In the paper by Coelho et al. (2009), several published algorithms were implemented and tested on the U2OS data set. To compare our algorithm with these algorithms, we ran it on the same benchmark data set and list the values of the above-mentioned metrics in Table 1. The values of the different metrics for all algorithms except ours are taken directly from the paper of Coelho et al. (2009). The results in Table 1 show that our method performs best in most categories. In particular, due to the global optimality property of graph cut and the convex nuclei shape assumption, our method provides a significant quantitative improvement [a decrease of 3.2 in Hausdorff distance and of 1.8 in the merged nuclei error per slice relative to the best method (called ‘Merging Algorithm’) in the paper by Coelho et al. (2009)].

Table 1. Comparison of segmentation algorithms. The values for all methods except ours are directly from the paper by Coelho et al. (2009)

Algorithm             RI (%)  JI   Hausdorff  NSD (×10)  Split  Merged  Spurious  Missing
AS manual             95      2.4  9.7        0.5        1.6    1.0     0.8       2.2
RC threshold          92      2.2  34.8       1.2        1.1    2.4     0.3       5.5
Otsu threshold        92      2.2  34.9       1.2        1.1    2.4     0.3       5.6
Mean threshold        96      2.2  26.5       1.0        1.3    3.4     0.9       3.6
Watershed (direct)    91      1.9  34.9       3.6        13.8   1.2     2.0       3.0
Watershed (gradient)  90      1.8  34.6       3.0        7.7    2.0     2.0       2.9
Active masks          87      2.1  148.3      5.5        10.5   2.1     0.4       10.8
Merging algorithm     96      2.2  12.9       0.7        1.8    2.1     1.0       3.3
Our algorithm         96      2.5  9.7        0.6        1.4    0.3     0.9       3.3
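
The headline numbers can be checked directly from Table 1. The short script below transcribes the rows for the automatic methods (the ‘AS manual’ row is left out because its name indicates a manual labelling) and assumes that higher is better for RI and JI and lower is better for every other column. That direction convention and the helper names are assumptions made for this illustration.

```python
# Values transcribed from Table 1, in the column order given by COLUMNS.
COLUMNS = ["RI", "JI", "Hausdorff", "NSD", "Split", "Merged", "Spurious", "Missing"]
TABLE_1 = {
    "RC threshold":         [92, 2.2, 34.8, 1.2, 1.1, 2.4, 0.3, 5.5],
    "Otsu threshold":       [92, 2.2, 34.9, 1.2, 1.1, 2.4, 0.3, 5.6],
    "Mean threshold":       [96, 2.2, 26.5, 1.0, 1.3, 3.4, 0.9, 3.6],
    "Watershed (direct)":   [91, 1.9, 34.9, 3.6, 13.8, 1.2, 2.0, 3.0],
    "Watershed (gradient)": [90, 1.8, 34.6, 3.0, 7.7, 2.0, 2.0, 2.9],
    "Active masks":         [87, 2.1, 148.3, 5.5, 10.5, 2.1, 0.4, 10.8],
    "Merging algorithm":    [96, 2.2, 12.9, 0.7, 1.8, 2.1, 1.0, 3.3],
    "Our algorithm":        [96, 2.5, 9.7, 0.6, 1.4, 0.3, 0.9, 3.3],
}
HIGHER_IS_BETTER = {"RI", "JI"}  # assumption: all other columns are errors or distances


def best_or_tied(table, method):
    """Columns in which `method` has the best (or tied-best) value."""
    winners = []
    for k, name in enumerate(COLUMNS):
        values = [row[k] for row in table.values()]
        best = max(values) if name in HIGHER_IS_BETTER else min(values)
        if table[method][k] == best:
            winners.append(name)
    return winners


def decrease(table, baseline, method, column):
    """How much lower `method` is than `baseline` in `column` (one decimal)."""
    k = COLUMNS.index(column)
    return round(table[baseline][k] - table[method][k], 1)


wins = best_or_tied(TABLE_1, "Our algorithm")
hausdorff_gain = decrease(TABLE_1, "Merging algorithm", "Our algorithm", "Hausdorff")
merged_gain = decrease(TABLE_1, "Merging algorithm", "Our algorithm", "Merged")
```

Under these assumptions the proposed method is best or tied on five of the eight measures (RI, JI, Hausdorff, NSD and Merged), while simple thresholding has fewer split and spurious errors and the gradient watershed has fewer missing nuclei.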

The nuclei segmentation algorithm in this paper is implemented in Matlab without any code optimization. We evaluate its time performance on an Intel(R) Core(TM) i7 CPU 880 @ 3.07 GHz running a 64-bit Windows OS. Only one core (eight cores available) is used, with single-threaded code. The time used for graph cut and for convexity–concavity splitting is listed in Table 2 for greyscale images (red channel from colour image) of size 1940 × 1940 pixels and of size 1349 × 1030 pixels, respectively. The table shows that the graph cut and the convexity–concavity splitting procedures each take less than about 1.6 s per larger image and less than about 0.9 s per smaller image. The time performance could be improved considerably by parallel programming.

There are only two parameters, λ and math formula, in our whole segmentation algorithm. λ specifies the relative importance of the regional term math formula versus the boundary term math formula in the graph-cut energy function in Eq. (1). The larger the value of λ, the more the method behaves like a thresholding algorithm. Conversely, the smaller the value of λ, the smoother the segmentation result. The segmentation results with different λ values are shown in Fig. 12. Fig. 12 shows that the segmentation results are not sensitive to the value of λ: for a randomly chosen small patch from our microscopic images, λ can be set anywhere in the wide range [5, 30]. For the whole stack of our fluorescence microscopic images, the valid range of λ is equally wide, [5, 30]. We set λ to the same value, 20, for our drosophila eye images and for the CMU U2OS data set. The value of the parameter math formula indicates the degree to which the split components are convex. Therefore, its value depends on the size of the objects and on how convex the final split objects are required to be. In our drosophila eye images, the density of nuclei is higher than the density of cells in the CMU U2OS data set, and the average size of nuclei in our images is smaller than the average size of cells in the CMU U2OS data set. Therefore, we set math formula equal to 5 and 25 for our dense nuclei images and the CMU U2OS data set, respectively. All of the parameters and their values are listed in Table 3.

Figure 12.

Different foreground segmentation results for the same patch with different values of the parameter λ.
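
The role of λ can be made concrete with a toy one-dimensional example. The sketch below is not Eq. (1) of this paper: it assumes the generic form E = λ · (sum of regional costs) + (number of label changes between neighbours), uses made-up regional costs (1 − I for foreground, I for background) instead of the Poisson-based initialization, and minimizes the energy by brute force rather than by graph cut. All names and values are hypothetical.

```python
from itertools import product


def energy(labels, signal, lam):
    """lam * regional term + boundary term for a 1D binary labelling."""
    # Regional cost: labelling a bright pixel as foreground (1) is cheap.
    regional = sum((1.0 - v) if l == 1 else v for l, v in zip(labels, signal))
    # Boundary cost: one unit for every label change between neighbours.
    boundary = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return lam * regional + boundary


def best_labelling(signal, lam):
    """Exhaustive minimization; feasible only for very short signals."""
    return min(product((0, 1), repeat=len(signal)),
               key=lambda labels: energy(labels, signal, lam))


# Normalized intensities of eight neighbouring pixels (made-up values).
signal = [0.1, 0.2, 0.9, 0.4, 0.8, 0.9, 0.1, 0.6]
thresholded = tuple(1 if v > 0.5 else 0 for v in signal)

large_lambda = best_labelling(signal, 20)  # regional term dominates
small_lambda = best_labelling(signal, 1)   # boundary term dominates
```

With λ = 20 the minimizer coincides with plain thresholding at 0.5, whereas with λ = 1 the isolated dark pixels inside the bright run are absorbed and the result has a single label change, which mirrors the behaviour described for large and small λ.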

Table 2. Time performance of our algorithm consisting of graph cut and CCAM for our images and CMU U2OS data set

Data set                     Graph cut (seconds/slice)  CCAM (seconds/slice)
Ours (size 1940 × 1940)      1.4256                     1.5764
CMU U2OS (size 1349 × 1030)  0.7834                     0.8423

Table 3. Parameters and their values for different data sets

Data set   λ   math formula
Ours       20  5
CMU U2OS   20  25

Conclusion

In this paper, we proposed a new nuclei segmentation method that combines graph cut with the convex nuclei shape assumption for drosophila eye fluorescence microscopic images. A new initialization method based on the Poisson model of intensity of fluorescent microscopic images (Al-Kofahi et al., 2010) is proposed to initialize the graph-cut model. There is only one robust parameter, λ, to be estimated in the initialization of the graph-cut energy function. A new, simple and effective CCAM is proposed to split touching or clustered cells or nuclei. Our cluster splitting method does not have to detect all concave points, and there is not even a need to define concave points. Therefore, the splitting path is not constrained to lie between concave points. Furthermore, small background areas appearing as holes within nuclei clumps can be naturally integrated into our splitting paths. The convex shape assumption is more flexible and general than the elliptical nuclei shape model in the paper (Bai et al., 2009).

Experimental results demonstrate that our approach is a promising one for segmenting nuclei from drosophila fluorescence microscopic images with high nuclei density. Performance evaluation on a public hand-labelled 2D benchmark shows that our method outperforms the segmentation methods implemented in the paper (Coelho et al., 2009) on most metrics and obtains substantial quantitative improvement over them. For instance, relative to the best of those methods, the proposed algorithm achieves a 3.2 decrease in Hausdorff distance and a 1.8 decrease in the merged nuclei error per slice. However, some undersegmented cases can be found in Figs. 7 and 8. They occur when the shape of the nuclei clump is already convex and the border edges of a nucleus within the clump are weak (such as the elongated clump, probably consisting of three nuclei, at the bottom left of Fig. 9A). It is hard to split this kind of clump successfully without considering 3D information (information from upper or lower slices) or information on normal nuclei size. We will extend the proposed method by incorporating 3D information and information on normal nuclei size in future research. In general, the flexibility of initializing the graph-cut energy function using the Poisson model of intensity of fluorescent microscopic images (Al-Kofahi et al., 2010) makes our algorithm applicable to many other areas. In particular, the proposed algorithm can be used for segmenting objects with convex shape, such as particles in physical, chemical, materials and biological research.

Acknowledgements

We thank the anonymous reviewers for their critical comments and suggestions, which greatly improved this paper.
