Performance of Autofocusing
The 26 slides in the 2010 testing set were used to evaluate the accuracy of dynamic autofocusing. The search space was set to ±200 μm around the focus position, covering 201 steps with a step size of 2 μm. Across 260 experimental tests, the dynamic autofocusing accuracy of our method was 2.06 μm. In total, 5,200 focused images were automatically acquired for evaluating autofocusing satisfaction by Path.1 and Path.2, who were asked to classify these images into three degrees according to their observations. As a result, Path.1/Path.2 rated 5,112/5,117 images as clear, 41/37 as poor, and 47/46 as error, respectively. The common satisfaction degree was 98.2%. In addition, one autofocusing process required 14 z-motor steps (about 3 s) on average.
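Since only about 14 of the 201 candidate z positions are visited per run, the search is clearly not exhaustive. The following is a hypothetical sketch (not the authors' exact algorithm) of how such a search could hill-climb along a focus-measure curve within the ±200 μm range; `focus_measure` stands in for any sharpness score computed on the live image.

```python
# Hypothetical hill-climbing autofocus sketch: step the z-motor toward
# increasing focus measure and stop once the score drops, rather than
# scanning all 201 positions in the +/-200 um search space.

def autofocus(focus_measure, start=0, step=2, lo=-200, hi=200):
    """Climb toward the z position maximizing focus_measure(z).

    focus_measure: callable z -> sharpness score (assumed unimodal near focus).
    Returns (best_z, number_of_motor_steps_used).
    """
    z, steps = start, 0
    # Probe one step up to decide the climbing direction.
    direction = step if focus_measure(start + step) > focus_measure(start) else -step
    best = focus_measure(z)
    while lo <= z + direction <= hi:
        candidate = focus_measure(z + direction)
        steps += 1
        if candidate <= best:          # passed the peak: stop
            break
        z, best = z + direction, candidate
    return z, steps
```

With a unimodal focus curve peaking at, say, +20 μm, this visits on the order of a dozen z positions, consistent with the step count reported above.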
Performance of Cell Segmentation
In our segmentation method, there are two major parameters that we tuned on a set of training images. The cluster number in the global segmentation was tuned on 30 training images. We set it to 3, 4, 5, and 6; the corresponding cytoplasm segmentation accuracy Acc was 0.89, 0.93, 0.93, and 0.91, respectively. Considering the tradeoff between accuracy and computational complexity, we chose 4 as the cluster number. The sσ in the LAGC was tuned on 30 nucleus images. We varied the value of sσ (3, 5, 10, 15); the F-measure was 0.80, 0.84, 0.88, and 0.88, respectively. Considering that the computational burden increases with more clusters and larger sσ, we suggest using a cluster number of 4 and sσ = 10. Other parameters were tuned empirically or set the same as in our previous studies [49, 53]. Since these parameters mainly depend on cell size, our method should generalize well when images are captured under the same objective magnification.
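The selection rule implied above can be made explicit: among candidate values whose score ties the best observed one, take the smallest (cheapest) value. A minimal sketch, using the scores reported in the text:

```python
# Sketch of the tuning rule described above: among candidate parameter
# values, keep the smallest one whose score matches the best observed
# score (within an optional tolerance), trading accuracy against cost.

def pick_parameter(candidates, scores, tol=0.0):
    """Return the smallest candidate whose score is within tol of the maximum."""
    best = max(scores)
    for value, score in sorted(zip(candidates, scores)):
        if score >= best - tol:
            return value

# Cluster number: Acc for k = 3, 4, 5, 6 (values from the text).
print(pick_parameter([3, 4, 5, 6], [0.89, 0.93, 0.93, 0.91]))      # -> 4
# LAGC scale: F-measure for s_sigma = 3, 5, 10, 15 (values from the text).
print(pick_parameter([3, 5, 10, 15], [0.80, 0.84, 0.88, 0.88]))    # -> 10
```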
The average Acc obtained by the methods in Refs. 16, 17, and 19 and by our proposed global segmentation is 0.64, 0.68, 0.76, and 0.93, respectively. All three compared methods perform clearly worse than our cytoplasm segmentation method. In Gençtav et al.'s method, the only parameter is the radius of the disk in the black top-hat algorithm. We found that this method had difficulty obtaining consistently satisfactory results with the same value of this parameter across different images, so we used a radius of 210 pixels as set in Ref. 16. For Harandi et al.'s method, the segmentation results differ when the parameter ν and the iteration step of the active contour model take different values. Based on our empirical tests, the iteration step was set to 5,000 to ensure convergence of the contour, and ν was set to 0.8 in accordance with Ref. 17. Our cytoplasm segmentation method takes 0.3 s per image on average.
Figure 7 shows a comparison of nucleus binarization results on an image patch containing two abnormal nuclei (indicated by green arrows). The result of LAGC is much better than those of the other algorithms in terms of binarization accuracy, especially for the abnormal nuclei. For example, the two abnormal nuclei are wrongly binarized by both Li et al.'s method (Fig. 7b) and Al-Kofahi et al.'s method (Fig. 7c).
Figure 6. Comparison of (a) abnormal cell images and (b) “hard negatives” using the N/C ratio feature. In each graph, the first through third columns illustrate the image samples, the corresponding regions of nuclei and cytoplasm, and the N/C ratio values. [Color figure can be viewed in the online issue which is available at wileyonlinelibrary.com]
Table 4 shows the comparison of LAGC and the other two algorithms in terms of average precision, recall, and F-measure of binarization for both normal and abnormal nuclei. The LAGC achieves a 0.873 F-measure on binarization of all nuclei and a 0.884 F-measure on abnormal nuclei. For nucleus detection, the LAGC achieves a 0.99 detection rate, whereas the two compared methods achieve 0.8 and 0.78. Furthermore, for LAGC, 6.2%, 93.8%, and 51.6% of abnormal nucleus binarization results are poor, acceptable, and very accurate, respectively, whereas the corresponding results are 45.3%, 54.7%, and 17.2% for one compared method and 48.4%, 51.6%, and 20.3% for the other. From these three evaluations, we can see that the LAGC outperforms the other two methods. It should be noted that in these evaluations, touching nuclei are considered as a whole object.
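The pixel-based criterion underlying Table 4 can be sketched as follows; masks are flattened here to 0/1 sequences for simplicity, with precision/recall computed over foreground pixels:

```python
# Minimal sketch of the pixel-based evaluation used in Table 4:
# precision, recall, and F-measure between a predicted binary mask
# and the ground-truth mask (both given as flat 0/1 sequences).

def binarization_scores(pred, truth):
    tp = sum(p and t for p, t in zip(pred, truth))           # true foreground
    fp = sum(p and not t for p, t in zip(pred, truth))       # false foreground
    fn = sum(t and not p for p, t in zip(pred, truth))       # missed foreground
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f
```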
Table 4. Comparison of average nucleus binarization performance using pixel based criterion
| Method | All nuclei | Abnormal nuclei |
The original version of Li et al.'s method is designed for single cervical cell images. We extended it to process all nuclei in an image by eliminating the shape constraints on candidates and performing the radiating GVF snake on all candidates. Following their settings, the parameters α, μ, β, δ, γ, and θ are set to 1, 1, 5, 0.5, 10, and 2, respectively. Al-Kofahi et al.'s method is implemented in the Farsight open source project; in our comparison, its GC-based binarization algorithm is used.
There are in total 549 overlapping nuclei in our images. Among them, 508 (92.5%) are correctly split, 15 (2.7%) are undersplit, and 26 (4.7%) have encroachment errors. Furthermore, the average precision, recall, and F-measure of the reconstruction results are 0.92, 0.87, and 0.89, respectively.
Figure 8 shows cell segmentation results from our test image set. Our proposed methods accurately delineate the boundaries of cytoplasm in H&E stained images in the presence of inhomogeneous illumination, inconsistent staining, and dirt occlusion, and achieve promising segmentation results for nuclei and touching nuclei with weak staining and nonuniform chromatin distribution. The average time cost of the whole nucleus segmentation procedure is about 1.6 s per image.
Figure 7. Comparison of nuclei binarization results with (a) ground truth, (b) Li et al.'s method , (c) Al-Kofahi et al.'s method , and (d) LAGC. The edges of binary masks are overlaid on the color images. [Color figure can be viewed in the online issue which is available at wileyonlinelibrary.com]
Performance of Cell Classification
To train Classifier 1#, a total of 2,089 nuclei and 5,223 artifacts were collected from the training and validation sets. Classifier 2# was trained on 1,126 abnormal nuclei and 1,126 normal nuclei. As shown in Table 5, with fivefold cross validation using different classifiers and the features ranked by QMI, the nucleus/artifact classifier and the abnormal/normal nucleus classifier achieved the highest CCR when trained with RF and MLP, respectively.
Table 5. The performance of nucleus/artifact classifier and abnormal/normal nucleus classifier using four classifiers
| Classifier | Nuclei/artifacts, % | Abnormal/normal nuclei, % |
Figure 9a shows the ranked features evaluated with the RF classifier. The maximum CCR was achieved when the following four features were eliminated: entropy, average intensity, variance, and average color. The best feature selected by QMI is the roughness index (RI; 15). Figure 9b shows the ranked features evaluated with the MLP classifier. According to this curve, the five eliminated features were average color, average intensity, boundary intensity, standard deviation of the normalized radial length (SDNRL; 13), and area ratio (AR; 14). Note that the best five features selected by QMI were perimeter, longest diameter, convex hull area, LBP mean value, and area.
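The curves in Figure 9 correspond to dropping features from the bottom of the QMI ranking and tracking the cross-validated CCR at each count. A minimal sketch of that elimination loop, where `evaluate_ccr` is a hypothetical stand-in for retraining and cross-validating the RF or MLP classifier on a feature subset:

```python
# Sketch of the ranking-then-elimination procedure behind Figure 9:
# evaluate the CCR on every prefix of the QMI-ranked feature list and
# keep the subset with the maximum CCR. `evaluate_ccr` is a stand-in
# for the cross-validated classifier training (hypothetical here).

def select_feature_count(ranked_features, evaluate_ccr):
    """Return (best_subset, best_ccr) over all prefixes of the ranking."""
    best_subset, best_ccr = ranked_features, evaluate_ccr(ranked_features)
    for n in range(len(ranked_features) - 1, 0, -1):
        subset = ranked_features[:n]       # drop lowest-ranked features first
        ccr = evaluate_ccr(subset)
        if ccr > best_ccr:
            best_subset, best_ccr = subset, ccr
    return best_subset, best_ccr
```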
Figure 8. Automatic segmentation results for cervical cell images with original resolution. The upper two images contain abnormal cells, whereas the lower two are normal cases. The boundaries of the cytoplasm and the nuclei delineated by our methods are marked as yellow and green, respectively. [Color figure can be viewed in the online issue which is available at wileyonlinelibrary.com]
Figure 9. The upper and lower figures show the variation in CCR with the remaining features used in the RF classifier and MLP classifier, respectively. The maximum CCRs were achieved using 14 features and 18 features, respectively. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]
To further improve the TPR of Classifier 1# and Classifier 2#, the positive classes were oversampled at 50%, 100%, 200%, 300%, 400%, 500%, and 600% using SMOTE, with the number of nearest neighbors set to 5. Based on fivefold cross-validation performance on the nucleus/artifact data set, we chose to oversample the nucleus class at 300%, because it yielded the highest CCR (98.0%) and relatively high TPR (99.0%) and TNR (96.9%). On the abnormal/normal nucleus data set, we chose to oversample the abnormal nucleus class at 400%, since it achieved the highest CCR (94.3%) and relatively high TPR (98.0%) and TNR (90.5%).
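For intuition, a minimal pure-Python sketch of the SMOTE interpolation step (not a production implementation): each synthetic minority sample is placed at a random point on the segment between a real minority sample and one of its k nearest minority neighbors.

```python
import random

# Minimal SMOTE sketch: generate synthetic minority samples by linear
# interpolation between a real sample and one of its k nearest minority
# neighbors (k = 5 in the paper). Illustrative only; real pipelines
# typically use an optimized library implementation.

def smote(samples, oversample_pct, k=5, seed=0):
    """samples: list of minority-class feature vectors (lists of floats).
    oversample_pct: 300 means generate 3 synthetic samples per original."""
    rng = random.Random(seed)
    n_new = round(len(samples) * oversample_pct / 100)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(samples)
        # k nearest minority neighbors of x (excluding x itself)
        neighbors = sorted((s for s in samples if s is not x),
                           key=lambda s: sum((a - b) ** 2 for a, b in zip(x, s)))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()             # random position along the segment
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic
```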
The approximate N/C ratio feature was extracted from 1,126 abnormal nuclei and 1,176 hard negatives to train Classifier 3#. Based on ROC curve analysis, the N/C ratio threshold was set to 0.098, which simultaneously achieved a high sensitivity (98.2%) and a promising false positive rate (19%). This setting causes a slight decrease in system sensitivity but significantly alleviates the observer's burden of targeted reading. Among the classification steps, the artifact filter, Classifier 1#, and Classifier 2# took 0.2 s on average, while the atrophic cell filter, context analysis, and Classifier 3# took 1 min on average.
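The threshold selection described above can be sketched as a sweep over candidate N/C ratio cutoffs, keeping the lowest false positive rate among thresholds that retain a target sensitivity. The data below are hypothetical; the paper's reported operating point was 0.098.

```python
# Sketch of ROC-based threshold selection: for each candidate threshold,
# compute sensitivity (TPR) and false positive rate on labeled samples,
# then keep the lowest-FPR threshold whose TPR stays above a target.
# Scores and labels here are hypothetical illustrations.

def pick_threshold(scores, labels, candidates, min_tpr=0.98):
    best = None
    for t in candidates:
        preds = [s >= t for s in scores]
        tp = sum(p and l for p, l in zip(preds, labels))
        fn = sum(l and not p for p, l in zip(preds, labels))
        fp = sum(p and not l for p, l in zip(preds, labels))
        tn = sum(not p and not l for p, l in zip(preds, labels))
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        if tpr >= min_tpr and (best is None or fpr < best[1]):
            best = (t, fpr)
    return best            # (threshold, false positive rate) or None
```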