Automatic detection and analysis of cell motility in phase-contrast time-lapse images using a combination of maximally stable extremal regions and Kalman filter approaches



    1. Biocenter Oulu, University of Oulu, Finland
    2. Mika Kaakinen and Sami Huttunen contributed equally to this work

    1. Mika Kaakinen and Sami Huttunen contributed equally to this work
    2. Department of Computer Science and Engineering, University of Oulu, Finland

    1. Department of Biological and Environmental Science, Nanoscience Center, University of Jyväskylä, Finland
    2. Department of Mathematical Information Technology, University of Jyväskylä, Finland

    1. Department of Biological and Environmental Science, Nanoscience Center, University of Jyväskylä, Finland

    1. Department of Computer Science and Engineering, University of Oulu, Finland

    Corresponding author
    1. Biocenter Oulu, University of Oulu, Finland
    2. Oulu Center for Cell-Matrix Research, Department of Medical Biochemistry and Molecular Biology, Institute of Biomedicine, University of Oulu, Finland
    • Correspondence to: Lauri Eklund, Oulu Center for Cell-Matrix Research, Department of Medical Biochemistry and Molecular Biology, Institute of Biomedicine, University of Oulu, Finland P.O. Box 5000, FI-90014 Oulu, Finland. Tel: +358 294 486073; fax: +358 8 537 6115; e-mail:



Phase-contrast illumination is a simple and the most commonly used microscopic method for observing nonstained living cells. Automatic cell segmentation and motion analysis provide tools to analyze single cell motility in large cell populations. However, the challenge is to find a method that is sufficiently accurate to generate reliable results, robust enough to function under the wide range of illumination conditions encountered in phase-contrast microscopy, and computationally light enough for efficient analysis of large numbers of cells and image frames. To develop better automatic tools for the analysis of low magnification phase-contrast images in time-lapse cell migration movies, we investigated the performance of a cell segmentation method that is based on the intrinsic properties of maximally stable extremal regions (MSER). MSER was found to be reliable and effective in a wide range of experimental conditions. When compared to the commonly used segmentation approaches, MSER required negligible preoptimization steps, thus dramatically reducing the computation time. To analyze cell migration characteristics in time-lapse movies, the MSER-based automatic cell detection was accompanied by a Kalman filter multiobject tracker that efficiently tracked individual cells even in confluent cell populations. This allowed quantitative cell motion analysis resulting in accurate measurements of the migration magnitude and direction of individual cells, as well as characteristics of collective migration of cell groups. Our results demonstrate that MSER accompanied by temporal data association is a powerful tool for accurate and reliable analysis of the dynamic behaviour of cells in phase-contrast image sequences. These techniques tolerate varying and nonoptimal imaging conditions and, due to their relatively light computational requirements, they should help to resolve problems in computationally demanding and often time-consuming large-scale dynamical analysis of cultured cells.


Phase-contrast microscopy is the most commonly used contrasting method to visualize living cells. This is due to the relatively simple configuration of the microscopy instruments, the resulting low costs, and the ability to observe cells without staining and without the phototoxicity of the short wavelength light used in fluorescence microscopy. Manual tracking of thousands of cells over time, however, is not feasible, and automatic computer based cell segmentation and tracking approaches are necessary for quantitative studies. In contrast to the relative ease of image production, automatic cell segmentation of phase-contrast images has been more challenging. Phase-contrast images of cells suffer from low contrast with respect to the background, and the pixel intensity distribution varies within the cells due to varying thicknesses, making identification of single cells difficult. Phase contrast also produces an artificial halo effect surrounding the cells that obscures cell interfaces.

Several approaches have been developed for cell segmentation that are based on the separation of connected pixels belonging to the cell from those belonging to the background. Such methods include thresholding, the watershed method and texture analysis (Wu et al., 1995; Koyuncu et al., 2012; Korzynska et al., 2007). Alternatively, active contour algorithms that capture cell boundaries have been used (Tscherepanow et al., 2008; Wang, He & Metaxas, 2007; Ali et al., 2007; Seroussi et al., 2012). The shortcoming of the commonly used techniques is the requirement for image preprocessing or the necessity to use accompanying techniques due to nonuniformities of pixel intensities inside the segmented cells, similarities in the background and specimen pixel intensities, or because the cell boundaries cannot be clearly resolved (Wu et al., 1995; Tse et al., 2009; Debeir et al., 2005; Ali et al., 2007; Yin et al., 2012; Ambühl et al., 2012). Manual adjustments and preprocessing techniques require expertise and their use significantly increases computational time. Therefore, reliable automatic analysis techniques that are easier to use in varying conditions are urgently needed.

In 2004 Matas et al. described a technique for detecting regions in an image that remain stable over a range of threshold values, called maximally stable extremal regions or MSERs (Matas et al., 2004). The MSER method has important characteristics that are useful for the segmentation of objects from complex images. First, the segmented regions are preserved under geometric and monotonic intensity transformations. Secondly, MSER is not sensitive to pixel intensity changes and nonuniformities in background intensities as it depends only on the ordering of pixel intensities within the MSER and its outer boundary. The feasibility of MSER in detecting cultured cells was recently recognized in a machine learning based approach in which the detector was used to find candidate regions that represented putative cells (Arteta et al., 2012). In our study, we extended the MSER approach to detect cells in a wide range of phase-contrast images and tested the feasibility and the intrinsic properties of MSER for the automatic detection of cells under varying and challenging imaging conditions.

In addition to automatic cell segmentation of single still images, we combined MSER with a Kalman filter based tracker modified for multiple objects (Huttunen & Heikkilä, 2008). This tracking approach uses segmentation masks as a source of measurements and utilizes soft assignment to associate the observations with the objects being tracked. The combination of MSER and Kalman filter based tracking enabled accurate and reliable cell segmentation and migration analysis even in demanding dense cell populations. Because the computational requirements of these approaches were low and the required user interaction was minimal, the methods developed should be useful in computationally more demanding assays such as experiments performed in multiwell plate formats, analysis of living cells at high densities, and as modules in image analysis software.



MSER is a blob detector that identifies regions in an image that remain stable over a certain number of thresholds. MSER was originally developed for stereo matching purposes, and it has later been widely used for object recognition. In MSER, the maximum region size and the stability range parameter (a delta value, Δ) control the segmentation sensitivity and accuracy. The maximum region size defines the upper limit for the size of the valid objects, in this work representing cell nuclei. The maximum region size was adjusted by first performing segmentation with default settings and then using a parameter value that corresponds to the mean region size multiplied by 1.25. The Δ-value can be adjusted to set the range of threshold values within which the regions should remain stable (i.e. the region does not grow significantly over the selected threshold range); the higher the Δ-value, the broader the range of threshold values within which the regions should remain stable. In phase-contrast images, the pixel intensity profile characterizing the cells often resembles that of the background, which makes cell segmentation a challenging problem. Because cells, and in particular cell nuclei, are slightly darker than the background, MSER could be considered a potential method for this problem. Those cells that exhibit pixel “leaking” to the background or to adjacent cells can be discarded, which reduces the risk of false detections. In addition to the two abovementioned parameters of MSER, the minimum region size, maximum variation and minimum diversity of the detected regions can also be adjusted. In our experiments we found that these parameters were not critical and the default values were used. The MSER was obtained from a VLFeat open source library (
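
The role of the Δ-value and of the maximum region size rule can be illustrated with a minimal sketch. It is not the VLFeat implementation. It assumes one common formulation of the stability measure, namely the region area at threshold t+Δ minus the area at t−Δ, divided by the area at t, evaluated for a dark region grown from a seed pixel. The function names and the toy grey values are hypothetical and chosen only for illustration.

```python
from collections import deque


def region_area(img, seed, t):
    """Area of the 4-connected component of pixels <= t that contains seed."""
    rows, cols = len(img), len(img[0])
    r0, c0 = seed
    if img[r0][c0] > t:
        return 0
    seen = {(r0, c0)}
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and img[nr][nc] <= t):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)


def variation(img, seed, t, delta):
    """Relative area growth across [t - delta, t + delta]; small means stable."""
    area = region_area(img, seed, t)
    if area == 0:
        return float("inf")
    grown = region_area(img, seed, t + delta)
    shrunk = region_area(img, seed, t - delta)
    return (grown - shrunk) / area


def max_region_size(default_areas, factor=1.25):
    """Rule from the text: mean region size of a default run multiplied by 1.25."""
    return factor * sum(default_areas) / len(default_areas)


# Toy image (assumed values): dark nucleus (50), cell body (120), background (200).
toy = [
    [200, 200, 200, 200, 200],
    [200, 120, 120, 120, 200],
    [200, 120,  50, 120, 200],
    [200, 120, 120, 120, 200],
    [200, 200, 200, 200, 200],
]
stable = variation(toy, (2, 2), t=80, delta=10)    # nucleus does not grow
leaking = variation(toy, (2, 2), t=115, delta=10)  # region jumps to the cell body
```

In this toy case the nucleus is stable around t = 80 for Δ = 10, whereas a larger Δ at the same threshold spans the jump to the cell body and the region is no longer considered stable, which mirrors why higher Δ-values make the detector less sensitive.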

Kalman filter

The Kalman filter is a common approach for object tracking as it can efficiently predict and filter the target locations based on previous observations and a dynamic state model. The main limitation of the Kalman filter is that it assumes a Gaussian distribution for the observation noise. It would be possible to utilize the MSER regions directly as a source of location measurements in a Kalman filter based tracker, but because detection errors occur frequently and do not usually follow a Gaussian distribution, this would easily lead to tracking failures. To alleviate this problem we applied the multiobject tracking approach proposed by Huttunen & Heikkilä (2008), which is based on probabilistic data association where soft assignments of the measurements are used instead of hard assignments. The basic idea in this method is to detect the objects several times from the same frame with varying detector settings and to compute soft assignments for each output. This increases the robustness of tracking against detection errors. The method was originally developed for human tracking, but it can be modified for cell tracking purposes. In our approach we process the binary MSER regions with a filter bank that contains a set of morphological erosions. With this approach we can, for example, separate a binary region of two cells that have been erroneously merged by the MSER detector. As a result we get a large set of measurements (centroids of the binary regions) for each cell, and we assume that these measurements together form a Gaussian mixture whose modes represent the correct cell locations. A Gaussian mixture model is used for deriving weights for the measurements that indicate the probability of belonging to an individual cell. The extraction of the measurements is demonstrated in Fig. 1. A probabilistic data association scheme is embedded in the Kalman filter framework to enable multiobject tracking. More details of the data association algorithm can be found in Huttunen & Heikkilä (2008).
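
The idea of soft assignment can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the algorithm of Huttunen & Heikkilä (2008): it assumes a single isotropic Gaussian around the predicted cell position instead of a fitted Gaussian mixture, and a scalar random-walk Kalman filter per coordinate instead of a full dynamic state model. All function names, the value of sigma and the noise parameters q and r are hypothetical.

```python
import math


def soft_weights(measurements, predicted, sigma):
    """Normalized Gaussian weights; centroids far from the prediction count less."""
    raw = [
        math.exp(-((mx - predicted[0]) ** 2 + (my - predicted[1]) ** 2)
                 / (2.0 * sigma ** 2))
        for mx, my in measurements
    ]
    total = sum(raw)
    if total == 0.0:
        # All centroids are extremely far away: fall back to uniform weights.
        return [1.0 / len(measurements)] * len(measurements)
    return [w / total for w in raw]


def weighted_centroid(measurements, weights):
    """Combine several centroid measurements into one soft measurement."""
    x = sum(w * m[0] for w, m in zip(weights, measurements))
    y = sum(w * m[1] for w, m in zip(weights, measurements))
    return x, y


def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar random-walk Kalman filter.

    x, p: previous estimate and its variance; z: measurement;
    q, r: process and measurement noise variances (assumed values).
    """
    p_pred = p + q                 # predict: position unchanged, uncertainty grows
    gain = p_pred / (p_pred + r)   # Kalman gain
    x_new = x + gain * (z - x)     # update with the soft measurement
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Centroids from several eroded masks; the third one is an outlier.
centroids = [(10.0, 10.0), (11.0, 10.0), (30.0, 30.0)]
weights = soft_weights(centroids, predicted=(10.0, 10.0), sigma=2.0)
zx, zy = weighted_centroid(centroids, weights)
x_est, x_var = kalman_step(10.0, 1.0, zx, q=0.5, r=1.0)
```

Because the outlying centroid receives a weight close to zero, it barely shifts the combined measurement, which is the behaviour illustrated by the down-weighted centroids in Fig. 1E.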

Figure 1.

Measurement extraction for the tracker. (A) An input image representing cultured squamous cell carcinoma cells. Six individual cells are numbered (1–6). (B) Cells segmented with MSER. Red contours outline the segmented areas and the green dots indicate the centroids of each detection. Note the closely adjacent cells that were erroneously merged as a single detection (1–2 and 3–4–5). (C) The corresponding primary masks of the segmented cells. (D) Primary masks after a morphological erosion operation. Red crosses represent new centroids of the primary masks. Note that the masks that erroneously merged cells 1–2 and 3–4–5 in the initial MSER segmentation were correctly separated. However, the masks corresponding to cells 5 and 6 were split in two after the erosion operation. (E) After several morphological operations the primary masks may contain many centroids (measurements). The locations of the centroids are fitted to a Gaussian mixture model. The centroids that are located far apart from the others (yellow arrowheads) are less likely to represent a true location and are down-weighted. (F) The resulting centroids are indicated by red crosses that accurately represent the original cell nuclei.

Cell culture and microscopy

Madin-Darby Canine Kidney (MDCK) cells were grown on six-well plates (Costar, Corning Incorporated, New York, U.S.A.) in MEM+GlutaMAX media (Gibco, Life Technologies Corporation, U.S.A.) supplemented with 5% fetal bovine serum (FBS) and 1% penicillin-streptomycin. Squamous cell carcinoma cells (HSC-3) were kindly provided by M.D. Jyri Moilanen and grown in Dulbecco's minimum essential medium (DMEM F-12, Gibco) supplemented with 10% heat inactivated FBS and 1% penicillin-streptomycin. Human umbilical vein endothelial cells (HUVECs) were grown in endothelial cell basal medium supplemented with endothelial cell growth supplement (Cell Applications, San Diego, CA, U.S.A.) and 10% FBS. For the cell motility assay, a scratch was made in the middle of the confluent culture and cells were imaged overnight in a temperature and CO2 controlled microscope stage incubator (Okolab, Italy) mounted onto an Olympus IX81 inverted microscope. Images consisting of 1376×1032 pixels were captured with a grey-scale camera (Olympus XM10, Germany) controlled by Cell^P software (Soft Imaging System, Münster, Germany).

Validation of the segmentation

The reliability of MSER segmentation was verified manually by comparing the cells in the microscopic fields with the objects detected by MSER. Before manual annotation, connected component analysis was performed to judge whether closely located regions represented a single segmentation or separate segmentations. Thereafter the contours of the binary masks of the detected regions were overlaid on the original input image to distinguish correctly detected cells from other segmented regions (false positives). The manual counting was performed with the ImageJ (National Institutes of Health, Bethesda, Maryland, U.S.A.) cell counter plug-in (Kurt De Vos, University of Sheffield, Academic Neurology, England). We define the detection recall as the total number of true positive cells divided by the total number of cells in a frame, and the detection precision as the number of true positive cells divided by the total number of segmented regions.
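
The two measures defined above can be written out as a short sketch. The function names and the counts in the example are hypothetical and are not taken from the tables of this study.

```python
def detection_recall(true_positive_cells, total_cells):
    """True positive cells divided by the manually verified number of cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return true_positive_cells / total_cells


def detection_precision(true_positive_cells, segmented_regions):
    """True positive cells divided by the number of segmented regions."""
    if segmented_regions <= 0:
        raise ValueError("segmented_regions must be positive")
    return true_positive_cells / segmented_regions


# Made-up counts for one frame: 100 cells, 90 segmented regions, 80 correct.
recall = detection_recall(80, 100)
precision = detection_precision(80, 90)
```

A detector that merges adjacent cells lowers recall without necessarily lowering precision, whereas detections of noncell objects lower precision only, which is why both values are reported throughout.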

Other methods used in comparison

For comparison, segmentation was performed using commonly available segmentation approaches: Otsu thresholding, watershed and active contours. The morphological watershed segmentation was performed according to Beucher & Meyer (1993). Images were first preprocessed with a median filter to reduce noise and then with a morphological erosion operation to remove roughness inside the cells and to enhance contrast between the cells and the background. Next a gradient magnitude filter was applied. Finally, local minima close to the local background were suppressed to prevent over-segmentation. After applying the watershed transform, all objects less than 30 pixels in size were removed. The active contours method was applied according to Caselles et al. (1997). The image was first smoothed with an anisotropic diffusion filter followed by calculation of the gradient magnitude. Then the geodesic active contour level set was initialized by four user-selected seed points from which a circle of radius 375 pixels defined the initial zero level set. The front was defined to contract during 7000 iterations, which led to splitting of the four initial contours into many individual contours defining individual cells. The speed term for the front propagation was calculated from the gradient magnitude image using a sigmoid function.

Optionally, prior to applying the abovementioned segmentation approaches, images were first processed with an image preconditioning technique ( described in Li & Kanade (2009). The image preconditioning aims to facilitate object segmentation. To obtain suitable images for watershed transform, the preconditioned images were preprocessed by setting all the pixel values less than 100 to a maximum intensity value (255) and the intensities of all pixels were inverted. This step was necessary to achieve local minima inside of the objects and to avoid overexpansion of the detected regions.

Tracking data were also compared to the MTrack2 approach (Stuurman, University of California and the Howard Hughes Medical Institute), a plug-in of freely downloadable Fiji software (Schindelin et al., 2012). Segmentation of cells was first performed by using MSER. MTrack2 was then applied to the primary mask of segmented cells by defining a maximum velocity of tracked objects to correspond to 20 pixels/frame and minimum track length to 1/frame. In manual verification of the tracking data each track was followed until termination. The causes of lost tracks (the tracks that failed to proceed until the end of image sequence) were categorized in four groups: 1. Track failure in which the tracker loses the target due to segmentation failure of MSER or the tracker shifts to an adjacent object (i.e. cell) during the tracking process. 2. The cell leaves the image border. 3. Cell roundup due to death or mitosis. Both cell death and mitosis are characterized by an initial increase of circularity and pixel intensity and decrease in cell area. In cell division the initial features are reversed and daughter cells emerge. 4. Situations in which the identity (ID) of an object is changed during the tracking process.


The segmentation and tracking approaches were implemented using a PC with Intel Xeon CPU operating at 2.50 GHz and with 128 GB RAM. MATLAB was used for MSER, Otsu and Kalman filtering. BioImageXD (Kankaanpää et al., 2012) was used for watershed and active contours.


Segmentation of single cells in dense cell cultures based on MSER approach

We first tested the effect of the Δ-value on MSER detection recall and precision (Fig. 2 and Table 1) using challenging phase-contrast images obtained from scratched confluent cell cultures that represent the classic “wound closure” cell migration assay. The image sequences 1 and 2 were acquired by different microscope users representing different experimental conditions, cell densities and cell types. In the images analyzed, the pixel intensity inside the nuclei showed the most prominent contrast in relation to the background and was therefore mainly detected by MSER (Fig. 2). Notably, when the default region size was used, Δ-value adjustment markedly contributed to detection outcome; the smallest Δ-values resulted in a higher number of false positive detections as well as artificial merging of single cells (Supporting Fig. S1, Table 1), whereas the detection improved with increasing Δ-values.

Table 1. The effect of Δ-value on detection recall and precision of cells in confluent cultures by using default region size settings/limited maximum region size. The annotation was performed on the first frame of sequence 1 and 2. The total number of cells (verified manually) in a frame is shown in parenthesis. Detection recall, total number of true positive cells divided by the total number of cells. Detection precision, the number of true positive cells divided by the total number of segmented regions
Experiment | Δ | Segmented regions | Segmented cells | Regions of merged cells | Detection recall% | Detection precision%
Sequence 1 (n = 1536) | 1 | 1298/1712 | 864/1344 | 227/83 | 60/90 | 71/81
Sequence 2 (n = 2816) | 1 | 1898/2800 | 1075/2112 | 537/303 | 38/78 | 57/78

Figure 2.

First frames of image sequences obtained from wounded confluent squamous cell carcinoma cell cultures (sequence 1, panel A) and MDCK cells (sequence 2, panel B) were used to test the effect of different Δ-values on MSER segmentation with limited maximum region size. Red contours outline the segmented areas most often representing cell nuclei. Green dots indicate the centroids of each detection to distinguish contours that belong to the individual cells. Examples of segmented regions are magnified in the insets at the top right corner of each MSER segmented image. Note that with smaller Δ-values more noncell objects (white arrowheads, exemplified also with higher magnification in the Supporting Fig. S1) are detected in the wound area.

Merging of several cells within a single detection (Fig. 1 and Supporting Fig. S1) was the most common failure that impaired the detection result. To reduce the incidence of merged regions we next limited the maximum region size. As a maximum value we used the average region size at Δ-value 3 that mostly represented the cell nuclei (red contours in Fig. 2). As shown in Table 1, limiting the maximum region size improved the detection precision in both sequences, more profoundly in sequence 2. These observations could be explained by the fact that with increasing Δ-values the segmentation becomes less sensitive and, if the pixel intensity variation is small within cells and in the regions between the cells, the region size is expanded until it fulfils the criteria of a maximally stable region. Adjustment of the three additional parameters available (minimum region size, maximum variation and minimum diversity) did not improve the segmentation result of MSER.

Comparison of MSER with commonly used segmentation approaches

To compare the cell detection recall and precision of MSER with the commonly used segmentation approaches, the first frames of sequence 1 and sequence 2 were segmented using the Otsu threshold, active contours, and watershed, techniques that are commonly accessible and have been applied to phase-contrast images (Table 2). Notably, in contrast to MSER, none of the other techniques could be applied directly to the original grey-scale images; all required manual adjustments to varying degrees.

Table 2. Comparison of MSER performance with three common segmentation approaches. The MSER Δ3 and limited maximum region size were used for comparison. The techniques indicated in the table were applied to the original grey-scale image representing the first frame of sequence 1 and 2. Alternatively, an image preconditioning (preconditioned, Li & Kanade, 2009) technique was used to improve contrast in images before applying a segmentation technique. Active contours and Otsu threshold could not be applied to the original input image to result in a reasonable outcome (-)
Sequence 1
Segmentation approach | Recall% | Precision%
MSER Δ3 | 88 | 92
Watershed (original) | 51 | 74
Watershed (preconditioned) | 88 | 97
Active contours (original) | - | -
Active contours (preconditioned) | 48 | 91
Otsu threshold (original) | - | -
Otsu threshold (preconditioned) | 66 | 97

Sequence 2
Segmentation approach | Recall% | Precision%
MSER Δ3 | 81 | 88
Watershed (original) | 87 | 52
Watershed (preconditioned) | 89 | 96
Active contours (original) | - | -
Active contours (preconditioned) | 50 | 93
Otsu threshold (original) | - | -
Otsu threshold (preconditioned) | 73 | 60

After the preprocessing steps, the morphological watershed segmentation worked moderately well (Fig. 3C and K, Table 2). Similar to MSER, the watershed transform in practice segmented cell nuclei (Supporting Fig. S2). In terms of detection recall in sequence 2, watershed even outperformed MSER after preprocessing (81% and 87% for MSER and watershed, respectively). At the same time, however, the watershed segmented image became heavily over-segmented, resulting in a poor detection precision when compared to MSER (52% versus 88%).

Figure 3.

The comparison of MSER segmentation to other segmentation approaches. (A and I) The original input images were the first frames of sequence 1 and sequence 2. (B and J) Segmentation of the input image was performed with MSER (Δ3 and limited maximum region size), (C and K) watershed with morphological erosion and median filtering and (D and L) Otsu threshold. Each detection is labelled with a different colour to help distinguish the individual segmented objects. (E and M) Processed images from the first frame of sequence 1 and the first frame of sequence 2, respectively, using the image preconditioning technique. Segmentation of the processed image was performed with active contours (F and N), watershed with invert threshold (G and O) and Otsu threshold (H and P).

The active contours and Otsu threshold methods failed to segment cells reliably in the original phase-contrast input images. Otsu thresholding classified the halo artefacts surrounding the cells as foreground, and the cells and the surrounding background as background (Fig. 3D and L), a shortcoming also recognized by Yin, Kanade & Chen (2012). This reduced the reliability because the extent of the halo varied between the individual cells and in different parts of the cell border, and the halo did not persist in moving cells.
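
Why a global threshold tends to pick out the halo rather than the cells can be seen from a minimal Otsu sketch on made-up grey values. The pixel values below (cells 100, background 120, halo 220) and their proportions are assumptions chosen for illustration, not measurements from the images of this study, and the function is a plain textbook formulation rather than the MATLAB routine used here.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold t that maximizes the between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # intensity sum of the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Assumed toy intensities: slightly dark cells, mid-grey background, bright halo.
CELL, BACKGROUND, HALO = 100, 120, 220
pixels = [BACKGROUND] * 60 + [CELL] * 30 + [HALO] * 10
t = otsu_threshold(pixels)
foreground = {v for v in pixels if v > t}
```

With these values the threshold falls between the background and the halo, so only the halo pixels end up in the foreground while the cells and the background share one class, in line with the behaviour seen in Fig. 3D and L.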

It was shown previously that grey-scale bright field images produced using a differential interference contrast technique can be processed with the preconditioning technique to significantly improve the segmentation outcome (Li & Kanade, 2009). To improve the outcome of watershed, active contours and Otsu threshold segmentation approaches, we next preprocessed the images using Li & Kanade's technique. This technique produced an image in which cells were represented by bright pixels and the background appeared as uniformly black, thus implying that segmentation of cells will be easier. Accordingly, image preconditioning improved the detection recall and precision of the abovementioned segmentation approaches significantly (Table 2).

The processing times of the techniques used at the given image resolution (1376×1032) varied from <0.01 s (Otsu threshold) to 176 s (active contours) (Table 3). MSER segmentation took less than 3 s to complete regardless of the selected Δ-value or region size settings. The preconditioning technique was computationally demanding and required almost three hours to complete per image, which made the approach intractable. To better compare the processing time of MSER with watershed and active contours, the source code of MSER was transferred to BioImageXD. However, the processing time did not differ markedly from that of the MATLAB implementation.

Table 3. The processing times (in seconds) of the different segmentation approaches. The values represent the mean of 10 measurements. Segmentations were performed on the first frame of sequence 1 with MATLAB (MSER and Otsu) or BioImageXD (active contours and watershed). For comparison MSER was also transferred to BioImageXD. MSER was implemented with Δ3 and maximum region size limited. The processing times include all the required preprocessing steps in case of watershed and active contours
Segmentation approach | Time (s) | Total time/frame
MSER Δ3 BioImageXD | 3.4 | 3.4
Active contours | 176 | 9368
Otsu threshold | <0.01 | 9192

Robustness under varying experimental conditions

An efficient segmentation approach should work in varying imaging conditions encountered in time-lapse microscopy, such as nonoptimal illumination, out of focus images, varying cell densities, different cellular phenotypes and appearance. Therefore, we next tested the feasibility of MSER segmentation under different conditions.

First we applied MSER to very complicated phase-contrast images of cultured human colon carcinoma cells (HT29). These cells grow as tight clusters that are scattered throughout the culturing plate (Fig. 4). In addition, the cells were often surrounded by a very prominent halo and exhibited different texture patterns and shapes. As in sequences 1 and 2, segmentation with the default maximum region size setting resulted in more merged cells than the corresponding segmentation with limited maximum region size. Therefore we tested the effect of Δ-values on the detection outcome with limited maximum region size. When using Δ1 the cells were segmented very efficiently (detection recall 85%). Notably, the detector managed to locate individual cells even in tight clusters in which the outlines of cells were hardly distinguishable by the human eye. The detection recall declined sharply with increasing Δ-values (76% with Δ3 and 60% with Δ5). As in sequences 1 and 2, detection precision improved with increasing Δ-values (79%, 89% and 94% for Δ1, 3 and 5, respectively), indicating that only a minority of the detections with higher Δ-values were false positives.

Figure 4.

MSER segmentation of cultured human colon carcinoma cells using different Δ-values and limited maximum region size. Colon cancer cells form complex clusters that in phase-contrast images exhibit variation in pixel intensity profile and the extent of bright halos surrounding the cells. Segmented cells are outlined in red. Green dots indicate the centroid of each segmented area. Note that MSER is able to detect individual cells (marked with red dots in the inset of the input image) even in the middle of cell cluster where the cells are very tightly packed.

We next tested the effect of different illumination conditions on the detection outcome using MSER. A subconfluent MDCK epithelial cell layer was illuminated with different light intensities (Fig. 5). The detection recall and precision of MSER were calculated for different Δ-values with the maximum region size limited (Table 4). Interestingly, Δ-values contributed only slightly to the detection recall at moderate and high levels of illumination. By contrast, Δ3 and Δ5 resulted in a progressively decreased detection recall at low illumination, but did not diminish detection precision. Δ1 in turn resulted in the best detection outcome under low illumination conditions. The weaker outcome at Δ3 and Δ5 under low level illumination is probably due to the relatively narrow pixel intensity profile in the images. Given that higher Δ-values require the connected regions to remain stable within a broader range of threshold steps, the pixel intensity profile is evidently too uniform to fulfil this criterion under low intensity illumination.

Table 4. The effect of sample illumination on detection precision and recall of MSER segmentation. Field of view comprising 499 MDCK epithelial cells
Illumination condition | Δ | Segmented regions | Segmented cells | Merged cells | Detection recall% | Detection precision%

Figure 5.

The effect of different illumination conditions on the segmentation result of MSER. The same sample was illuminated with different light intensities: high (a mean grey value of 128 and a high frequency of saturated pixels), moderate (a mean grey value of 91) and low (a mean grey value of 34). The grey value histograms above the image panels show the intensity distribution of pixels. The Δ-values indicated were used for segmentation, and in each case the maximum region size was limited. The primary masks of the corresponding detections are labelled with different colours to help distinguish individual detections. (A) The sample illuminated with high intensity light results in frequent bright pixel intensities. Bright halos surrounding cells are clearly pronounced (white arrowheads in the inset). (B) When illuminated with moderate intensity light, neither dark nor light pixels are overrepresented and the cells can easily be distinguished from the background by eye. Note that the intensity of bright halos is considerably decreased. (C) At low intensity illumination, the overall intensity profile is dark and the pixel intensities defining the cells resemble those of the background. Bright halos are still detectable, which helps to distinguish cells. When higher Δ-values are used the detection outcome decreases significantly at low light illumination, whereas at moderate and high illumination Δ-values have only a minor contribution to the segmentation result.

Image blurring is an unwanted condition encountered in time-lapse microscopy that is often due to a mechanical shift in the microscope focal plane. Therefore we next tested the performance of MSER in out-of-focus images. To this end, we artificially blurred the image acquired at the moderate light level in Fig. 5 by applying the Gaussian filter of ImageJ. Under this condition, MSER performed best with Δ1 and limited maximum region size (Fig. 6, detection recall 87% and precision 96%). With higher Δ-values, the detection outcome decreased markedly (detection recall 78% and precision 98% with Δ3; detection recall 21% and precision 100% with Δ5). Interestingly, although the default maximum region size resulted in a weaker outcome, it is notable that noncell regions (false positives) were not detected at any of the Δ-values used. Accordingly, the detection outcome decreased because of a higher incidence of merged cells.

Figure 6.

MSER applied to an image that was artificially blurred to resemble an out-of-focus image. A Gaussian filter (Sigma radius 4) was used to generate a blurred image from the original image in Fig. 5B. The effect of different Δ-values with the maximum region size limited on the segmentation result is shown. Segmented cells are outlined with red. Green dots indicate the centroids of each segmented area.

Cell tracking and motion analysis

When tracking individual cells in time-lapse experiments, it is necessary that the cells can be reliably identified in each successive frame. To fulfil this important criterion for an automated cell migration assay, we tested the usability of a multiobject tracker algorithm that is based on a Kalman filter. The movies analyzed were sequences 1 and 2, which represent the classic wound closure migration assay and consist of 25 frames acquired at 20 min intervals.

At the beginning of sequences 1 and 2 (the second frame), 96% of MSER detected objects, which were initialized for the tracker in the first frame, were in track in the following frame. However, in sequence 1, only 34% of the tracks were successfully followed until the end of the sequence whereas the number of MSER detected regions showed only a minor fluctuation (Fig. 7A). In sequence 2, 60% of the tracks initiated in the first frame were successfully followed until the end of the sequence. As in the case of sequence 1, the number of MSER detected regions remained the same or slightly increased (Fig. 7B). We hypothesized that the lost tracks could be explained either by failures of MSER in cell segmentation or by the inability of the tracker to follow a segmented target. Based on the high detection precision of MSER (at Δ3 and limited maximum region size, Table 1), it can be presumed that the majority (∼90%) of MSER segmented objects are true positive cells, thus suggesting the possibility that the cells may move too far between the frames or suddenly change their direction of movement. To survey the reasons for lost tracks in more detail, tracks corresponding to 41–42 randomly selected cells were followed until their disappearance (Fig. 8A, Supporting Movie S1). In sequence 1, tracking failures accounted for 43% of the lost tracks and the most common explanation for the failures appeared to be associated with sudden changes in the movement directions of the cells (47% of tracking failures, for details see the legend of Supporting Movie S1). Other tracking failures were associated with situations in which MSER failed in cell segmentation (20% of failures). In 27% of tracking failures no clear reason could be specified. A total of 14% of lost tracks were due to a change in cell ID; 34% of lost tracks were due to a biological reason (cell death or mitosis), and 11% were due to cell disappearance from the field of view. In sequence 2 (Fig. 8C, Supporting Movie S2), 48% of the lost tracks were explained by tracking failures, and 46% of the tracking failures were due to failures of MSER in cell segmentation. In the remaining tracking failures the tracker lost the original target and shifted to the adjacent cell (38%), or no clear reason for failure could be specified (8%). In only one case was the tracking failure associated with a sudden change in movement direction. Cell deaths or divisions explained 33% of lost tracks, ID changes 15%, and cells leaving the image border the remaining 4%.

Figure 7.

The performance of Kalman filter based cell tracking over time when applied to sequence 1 (A) and sequence 2 (B). 25 frames at 20 min intervals were analyzed for tracking performance. Primary masks generated from MSER segmentation were used to extract measurements for the tracker. White circles, the number of MSER segmented objects that are in track normalized to the tracks initialized in the first frame. Black squares, the number of all MSER segmented objects normalized to the number of MSER segmented objects in the first frame. Note that the number of tracked objects declines with time whereas the number of MSER segmented objects remains almost unchanged. Total number of cells in the first frames of sequences 1 and 2 were 1536 and 2816, respectively.

Figure 8.

Tracking outcomes of 41–42 randomly selected MSER segmented cells in sequences 1 and 2 using the Kalman filter and MTrack2 trackers. After applying the trackers to image sequences segmented with MSER (Δ3 and limited region size), the tracks corresponding to randomly selected and manually annotated cells were overlaid on the input sequence. Each track was followed until its termination over 25 frames acquired at 20 min intervals. (A) Kalman filtering of sequence 1. (B) MTrack2 applied to sequence 1. (C) Kalman filtering of sequence 2. (D) MTrack2 applied to sequence 2.

As a sudden change in the direction of movement appeared to be a common cause of lost tracks in sequence 1, the performance of the Kalman filter was further analyzed using a synthetic sequence (Supporting Movie S3). The tracker accurately followed the 7 pixel sized object when its displacement between subsequent frames was moderate (9 pixels), even when the direction of movement was radically changed (135°). However, when the time interval between the frames was doubled the tracker could no longer follow the object after the change in its direction. To investigate whether the tracking result in a biological sample could be improved by simply increasing the imaging rate, we next created a time-lapse movie of cultured endothelial cells (HUVEC), acquiring images every 5 min. At a 20 min time interval (every fourth frame included in the analysis), 76% of tracks corresponding to 39 randomly selected cells were lost due to track failures or ID changes by the end of the 25-frame sequence. In most cases (56%), the lost tracks were associated with an abrupt change in the direction of movement or a big leap between the image frames. In one case, the track failure was associated with segmentation failure and in two cases the detected regions were heavily fragmented. When the time elapsed between the frames was reduced to the original 5 min, 52% of the tracks that had been lost due to an abrupt change in speed or direction of movement were correctly tracked.
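
The synthetic experiment can be mimicked with a few lines of code. The sketch below is an illustration under stated assumptions and not the tracker used in this study: it assumes a constant-velocity Kalman filter, a fixed circular association gate of 20 pixels, and hypothetical noise settings (`q_pos`, `q_vel`, `r`). Only the 9 pixel step and the 135° turn are taken from the text; doubling the frame interval is imitated by keeping every second position.

```python
import numpy as np

def make_track(step=9.0, n_before=8, n_after=8, turn_deg=135.0):
    """Positions of an object that moves straight along x and then turns by turn_deg."""
    heading = np.deg2rad(turn_deg)
    v1 = np.array([step, 0.0])
    v2 = step * np.array([np.cos(heading), np.sin(heading)])
    pts = [np.zeros(2)]
    for k in range(n_before + n_after):
        pts.append(pts[-1] + (v1 if k < n_before else v2))
    return np.array(pts)

def follow(track, gate=20.0, q_pos=1.0, q_vel=25.0, r=0.25):
    """Follow one object; return the number of frames in which the detection
    fell inside the gate around the predicted position (all values hypothetical)."""
    F = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # constant-velocity motion model
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # only the position is measured
    Q = np.diag([q_pos, q_pos, q_vel, q_vel])   # process noise
    R = r * np.eye(2)                           # measurement noise
    x = np.array([track[0, 0], track[0, 1], 0.0, 0.0])
    P = np.diag([1.0, 1.0, 100.0, 100.0])       # velocity is initially unknown
    hits = 0
    for z in track[1:]:
        x = F @ x                               # predict
        P = F @ P @ F.T + Q
        innovation = z - H @ x
        if np.linalg.norm(innovation) > gate:
            continue                            # detection outside the gate: coast
        S = H @ P @ H.T + R                     # update
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(4) - K @ H) @ P
        hits += 1
    return hits

fast = make_track()      # one position per frame, 9 pixels apart
slow = fast[::2]         # every second frame only, 18 pixels apart
print(follow(fast), "of", len(fast) - 1, "frames associated")
print(follow(slow), "of", len(slow) - 1, "frames associated")
```

With these settings the prediction error at the turn is roughly 1.85 times the step length, so it stays inside the gate at the original frame rate but exceeds it when every second frame is dropped, after which the track coasts away from the object.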

The performance of the Kalman filter based multiobject tracker was next compared to MTrack2. MTrack2 (Fig. 8B and D) tracked cells less accurately in confluent cultures than the Kalman filter based tracker (Fig. 8A and C). MTrack2 continuously altered the identification numbers of the segmented cells starting from the beginning of the time-lapse series, thus leading to a very rapid decline in tracking performance. ID changes accounted for 73% and 69% of lost tracks in sequences 1 and 2, respectively. Track failures in MTrack2 resulted either from segmentation failure of MSER or from an unknown cause in which the tracker skipped one frame and identified the original cell as a new target in a subsequent frame.

In addition to tracking of individual cells in a dense cell population, it is often biologically relevant to analyze the collective motion of clustered cells to reveal coherent cellular behaviour (Friedl & Gilmour, 2009). To investigate collective migration patterns, the Kalman filter based tracker was applied to sequences 1 and 2 (one frame every 20 min). In this analysis, new tracks were initiated whenever new cells emerged and were detected by MSER, representing cells entering the field of view from outside the border, daughter cells after mitosis, or cells that MSER had failed to detect in previous frames. Labelling of the primary masks of MSER segmented cells with different colours indicating movement directions and magnitudes allowed visual inspection of the heterogeneity as well as consistency in cell movement patterns (Supporting Movies S4 and S5). In combination with quantitative presentation of movement directions and magnitudes by means of rose diagrams, we made spatiotemporal analyses of the cell population motilities and compared them between the cell cultures of squamous cell carcinoma cells (sequence 1) and normal epithelial cells (MDCK cells, sequence 2) (Fig. 9). Immediately after scratching the monolayer of MDCK cells, the majority of the cells on the left side of the scratch moved towards the wound, whereas a significant fraction of the cells on the right side moved towards the right border of the image (Fig. 9B, Supporting Movie S5). From the fifth frame onwards (1 h 20 min after wound scratching), a major fraction of the cells on both sides moved towards the wound. Some individual cells or small groups of cells showed diverse movement patterns throughout the sequence. During the progression of wound closure, the fraction of independently moving cells decreased and most of the cells in the field of view showed a coherent movement toward the wound. The migratory behaviour of squamous cell carcinoma cells (sequence 1) (Fig. 9A, Supporting Movie S4) was more stochastic both in space and in time, and the cancer cells showed a wider range of movement patterns when compared to MDCK cells in sequence 2. Moreover, a significant fraction of cells from the fifth frame onwards moved with a higher magnitude than normal epithelial cells. Most of the cells adjacent to the wound margin showed a clear directionality toward the wound, but in the margin area, some cell groups moved in the completely opposite direction. Still, each movement direction typically comprised several cells, which suggested that in this assay this particular cell type tends to move in groups (Supporting Movie S4) rather than as single cells.

Figure 9.

Rose diagrams demonstrating the motility of MSER segmented cells in sequence 1 (A) and sequence 2 (B). The diagrams show that the motility pattern of squamous cell carcinoma cells (A) differs markedly from normal MDCK epithelial cells (B). MSER (Δ3, region size limited) was used for segmentation followed by Kalman filtering. New tracks were allowed to be initiated throughout the sequences. Rose diagrams represent movements taking place at every fifth frame of a 25 frame sequence for each side (left and right) of the “wound”. Movement directions (degrees) are shown at the outer circle of the diagrams. Inner circles represent movement magnitudes in pixels ascending towards the outer circle. The proportion of cells that move in a particular direction at a particular magnitude of all moving cells is depicted in heat map form, blue representing lower and red higher values.
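
The binning behind a rose diagram of this kind can be sketched as follows. This is a minimal illustration and not the code used to produce Fig. 9: it assumes that each tracked cell has a centroid in two frames, and the number of direction bins, the magnitude bin edges and the `min_move` cut-off for "moving" cells are hypothetical choices. Angles are computed in image coordinates, so whether they run clockwise or counterclockwise on screen depends on the orientation of the y axis.

```python
import numpy as np

def rose_histogram(prev_xy, next_xy, n_dir_bins=12,
                   mag_edges=(0.0, 2.0, 4.0, 8.0, 16.0), min_move=0.5):
    """Proportion of moving cells per (direction, magnitude) bin.

    prev_xy, next_xy: (N, 2) centroids of the same N tracked cells in two frames.
    Returns an array of shape (n_dir_bins, len(mag_edges) - 1).
    """
    d = np.asarray(next_xy, dtype=float) - np.asarray(prev_xy, dtype=float)
    mag = np.hypot(d[:, 0], d[:, 1])               # movement magnitude in pixels
    moving = mag >= min_move                       # ignore cells that did not move
    ang = np.degrees(np.arctan2(d[moving, 1], d[moving, 0])) % 360.0
    mag_edges = np.asarray(mag_edges, dtype=float)
    # Cells faster than the last edge are counted in the outermost ring.
    clipped = np.clip(mag[moving], None, mag_edges[-1])
    dir_edges = np.linspace(0.0, 360.0, n_dir_bins + 1)
    counts, _, _ = np.histogram2d(ang, clipped, bins=[dir_edges, mag_edges])
    total = int(moving.sum())
    return counts / total if total else counts

# Hypothetical centroids of four tracked cells in two frames.
prev_xy = [(0, 0), (0, 0), (0, 0), (0, 0)]
next_xy = [(3, 1), (-1, 3), (-3, -1), (0, 0)]
rose = rose_histogram(prev_xy, next_xy)
print(rose.shape, rose.sum())
```

Each row of the returned array corresponds to one angular sector and each column to one magnitude ring, so the array can be drawn directly as a polar heat map.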


Among the commonly accessible image analysis software, segmentation approaches that can be directly, without preprocessing, applied to the phase-contrast raw grey-scale images are lacking. To improve the possibilities for automatic analysis of microscopic images acquired with the phase-contrast technique we tested the performance of MSER segmentation and Kalman filter based cell tracking. We found that MSER has several advantages over commonly used methods in cell segmentation, and when MSER was accompanied by the Kalman filter based tracker it allowed computationally efficient cell motility analysis.

In order to segment cells with high accuracy only two parameters, Δ-value and maximum region size, are adjusted in MSER, making it simple to use. We also found that in biological experiments MSER can be applied to a wide range of phase-contrast images, even to images representing complicated confluent cell cultures. MSER is considered to be a robust technique tolerating illumination changes (Mikolajczyk et al., 2005) and is thus likely to operate well with images generated with different microscopic instruments. In our study, we found that MSER efficiently segmented cells under nonoptimal conditions including variable illumination and out-of-focus images. Moreover, MSER has low computational requirements, which facilitates analysis with a standard PC. Most importantly, MSER can readily be applied to grey-scale images without any preprocessing of the input image, thus outperforming the commonly used segmentation approaches watershed, active contours and Otsu threshold.

MSER detects the regions of cells having “extremal” properties, i.e. all the pixels inside the detected region have either higher or lower intensities than all the pixels in the immediate outer boundary (Mikolajczyk et al., 2005). Therefore, the background of an image may exhibit nonuniformities in pixel intensity profile or even be similar to MSER regions without having an effect on the detection outcome. However, cells in phase-contrast images may exhibit disorganization of pixels in their boundaries due to changes in cell shape and thereby in the light path passing through the cell. In some parts of an MSER-defined region, the pixels inside the region and in the background may have similar intensities, so that the criterion of a maximally stable region is no longer fulfilled. This reduces detection recall. Moreover, most failures in detection recall, especially in the case of confluent cell cultures, can be explained by merged detections in which more than one cell is segmented within a single region. Despite the wide feasibility of MSER, there are complex situations that are especially challenging, such as cancer cell cultures where cells tend to grow on top of each other. Images that are clearly out-of-focus may also weaken the outcome of MSER segmentation, as noted in our artificially blurred test images. This is due to smoothing of the boundaries of the detected regions (Mikolajczyk et al., 2005), which may result in their instability, especially when higher Δ-values are used.
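
The effect of boundary smoothing on region stability can be made concrete with a one-dimensional toy profile. The sketch below is an illustration only and makes several assumptions that are not taken from this study: a single dark "cell" 21 pixels wide on a uniform background, hypothetical grey levels of 40 and 120, a hand-written Gaussian blur with sigma 4, and the width of the thresholded profile used in place of a region area.

```python
import numpy as np

def gaussian_blur_1d(profile, sigma):
    """Blur a 1-D intensity profile with a truncated, normalised Gaussian kernel."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(profile, radius, mode="edge")   # avoid darkening at the borders
    return np.convolve(padded, kernel, mode="valid")

def min_variation(profile, delta, max_width=40):
    """Smallest relative width change (W(t+delta) - W(t-delta)) / W(t) over thresholds t,
    where W(t) is the number of pixels with intensity <= t (single dark blob assumed)."""
    best = float("inf")
    for t in range(int(profile.min()) + delta, 256 - delta):
        lo = np.count_nonzero(profile <= t - delta)
        mid = np.count_nonzero(profile <= t)
        hi = np.count_nonzero(profile <= t + delta)
        if lo == 0 or mid > max_width:
            continue   # region not present yet, or larger than the size limit
        best = min(best, (hi - lo) / mid)
    return best

sharp = np.full(101, 120.0)
sharp[40:61] = 40.0                      # dark cell, 21 pixels wide
blurred = gaussian_blur_1d(sharp, sigma=4)

for delta in (1, 3, 5):
    print(delta, min_variation(sharp, delta), min_variation(blurred, delta))
```

For the sharp profile the width does not change over any ±Δ window between the two grey levels, whereas for the blurred profile the sloping edges make the width grow with the threshold, and a window of ±5 grey levels always catches at least one additional pixel.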

Tracking of individual cells in dense cell populations is challenging for many existing motion analysis approaches (Hand et al., 2009). Tracking performance is often dependent on cell detection accuracy, and despite the relatively low error rate of MSER segmentation, detection errors still occur and they are very harmful to tracking. This problem can be alleviated by employing the Kalman filter based multiobject tracker, which is robust to detection failures. As for any tracker, it is important that the interframe distance or movement direction of an object does not vary too much. The position of a cell as a function of time is dependent on the cell type, phenotype and experimental variables. In this study, three different cell types, namely squamous cell carcinoma cells (HSC-3), normal kidney epithelial MDCK cells, and primary endothelial cells (HUVECs), were imaged in challenging high density cultures. We found that the Kalman filter based tracker performed better than MTrack2 of Fiji, which relies on the identification of objects that are closest together in successive frames (nearest-neighbour method). Interestingly, the reasons for tracking failures were found to be different between sequences 1 (squamous cell carcinoma cells) and 2 (epithelial MDCK cells) when the Kalman filter based tracker was used; in sequence 1 the major reason was sudden changes in the migration directions of cells, whereas in sequence 2 the tracking failed most often due to segmentation failure.

In cell motility analysis, motion vectors were calculated for each MSER segmented cell between successive frames, and the results were presented as rose diagrams. This enabled simultaneous analysis of movement magnitude and direction as well as calculation of the proportion of cells that participated in a given type of movement. Motility analysis revealed considerable differences in movement patterns between different cell types. As previously observed (Farooqui & Fenteany, 2005; Poujade et al., 2007), most epithelial MDCK cells exhibited directional migration toward the wound. However, the assay also identified cells with a differing movement pattern, as certain cells were able to change their direction and move somewhat independently from adjacent cells. In contrast to MDCK cells, squamous cell carcinoma cells in sequence 1 showed more heterogeneity in their movement pattern, higher velocity and also lower tracking accuracy. Careful analysis of the cell movies and a synthetic image sequence revealed that the faster migration speed and the concomitant sudden changes in movement direction result in a weaker tracking outcome, which can be significantly improved by simply increasing the image acquisition rate. In the case of HUVECs, a time interval of 5 min between the frames gave sufficiently accurate results.


Modern microscopy instruments produce large numbers of image files, and image analysis has become a significant bottleneck. Effective analysis requires accurate automatic image analysis approaches that are computationally light. The novelty of our work arises from a combination of the blob detector MSER with a Kalman filter based tracker. MSER allowed accurate cell detection under various imaging and culturing conditions whereas the tracker compensated for the segmentation errors emerging from artificial merging of cells and increased the reliability of tracking. When compared with another commonly available approach, the combination of the multiobject tracker and MSER worked well in time-lapse imaging experiments of dense cell populations composed of thousands of closely adjacent cells in each image frame. We also show that in confluent cultures an appropriate image acquisition rate significantly improves the tracking of fast moving and closely adjacent cells.


This work was supported by grants from the Biocenter Finland (to LE and JH), Academy of Finland (136880 and Centre of Excellence Program 2012–2017 to LE; 257125 and 218109 to VM), and the Finnish Funding Agency for Technology and Innovation (FiDiPro project 40338/12 to VM). Mrs. Riitta Jokela is acknowledged for providing MDCK cells and Anne-Maria Pajari PhD (Department of Food and Environmental Science, University of Helsinki) for providing the original input file for Fig. 4. Juho Kannala DSc (tech) is thanked for critical reviewing of the manuscript, and Ilya Skovorodkin (PhD) and Veli-Pekka Ronkainen (PhD) for advice on microscopy. Mr. Antti Viklund is acknowledged for valuable advice with Fiji.