Automatic toxic granulation detection and grading based on speeded up robust features


Toxic granulation (TG) is the term used when the normally faint stippled granules in neutrophils stain an intense reddish violet. It is a consequence of activity against bacteria or proteins and is observed in serious infections, toxic or drug effects, and autoimmune processes (e.g., chronic polyarthritis) (1). In clinical laboratories, TG may be reported simply as present, or graded by laboratory professionals as slight, moderate, or marked (i.e., 1+ through 4+). The grading level depends on the degree of TG in individual cells and the relative fraction of neutrophils with TG (2). Irrespective of the hematology analyzer used, ∼15% of blood samples require manual microscopic observation because of either biological rules or analyzer flags (3), and in some cases a TG report obtained during this process helps to arrive at a diagnosis. However, the evaluation of TG is time-consuming and subjective, which limits its application. Despite significant improvements in automated blood film analysis systems in recent years, the automatic evaluation of TG has not yet been studied directly.

CellProfiler (4) and NIS-Elements (Nikon Corporation, Tokyo, Japan) provide interactive, semiautomatic granule analysis based on granule segmentation and parameter measurement, for example the IdentifyPrimaryObjects module in CellProfiler and the Object Count function in NIS-Elements. However, the following factors impede robust segmentation of toxic granules:

  • Changes in smear stain quality and illumination lead to color variation in TG and its cytoplasmic background.

  • The size and distribution of TG cover a wide range in microscopic images. Moreover, the compact distribution of TG under severe clinical conditions makes exact granule detection and segmentation difficult.

In contrast to the segmentation-and-measurement approach, Theera-Umpon and Dhompongsa (5) calculated granulometric features of the nucleus using mathematical morphology to improve the classification rate of leukocytes. A similar module, MeasureGranularity, has also been integrated in CellProfiler for granularity measurement. However, mathematical morphology gives only a rough estimate of granule size, which is not enough for detailed analysis. Hence, current measures need to be improved for fully automatic and quantitative analysis of TG. We show that a highly efficient blob detector can detect toxic granules and estimate their sizes through scale space, without the need for precise segmentation. This makes it more robust and better suited to efficient, automatic TG detection and analysis.

Toxic granules are usually brighter or darker than their surroundings and have a relatively homogeneous interior; such structures are described as blobs in computer vision. We adopt a scale-invariant blob detector for the detection and size estimation of toxic granules and propose seven features to quantitatively measure the TG level for grading. Scale-invariant blob detection originates from the theory of automatic scale selection (6). Blob-like structures are detected in images at their own characteristic scale, which avoids a segmentation process. A number of improved algorithms have been proposed and comparative studies conducted (7–10). Among these methods, speeded up robust features (SURF) outperforms the others for its fast computation and high stability. We first apply SURF to localize granules in the image and in scale space. Then, seven features are extracted to describe them quantitatively. The method is evaluated on automatic TG detection and grading by comparing the results with those obtained by an expert. Experiments show that both sensitivity and specificity for TG detection are promising, and the percentage agreement of grading is acceptable.

TG DETECTION AND GRANULE FEATURE EXTRACTION

Our blood samples were peripheral blood obtained from the routine workload of the Clinical Laboratory at Peking University First Hospital. Blood films were stained with Wright-Giemsa stain. An Olympus BX41 microscope with a SONY DXC-390P 3CCD color video camera connected to a standard PC was used to acquire image data. In accordance with the objective used in a standard manual microscopic leukocyte differential, we used a 100× oil immersion Plan Semi Apochromatic objective (1.30 NA, 0.2 mm W.D.) for image acquisition. Bright-field illumination was used.

Before TG detection, image preprocessing is conducted to improve image quality, and the cytoplasm is segmented to serve as the mask for TG detection. Images are first transformed into HSI space (hue, saturation, and intensity), because this transformation reduces the correlation between the color channels (compared with RGB) and allows the three channels to be handled separately (11). Granules show the greatest contrast in the saturation channel, so the subsequent processing is conducted in this channel. A linear shift and scaling is performed to normalize the saturation image. For segmentation of the cytoplasm, T. Bergen's method (11) is adopted, which combines pixelwise classification with template matching to locate erythrocytes and uses a level-set approach to obtain the exact contours of the leukocyte nucleus and cytoplasm regions.
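
The saturation step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the common HSI definition S = 1 − 3·min(R, G, B)/(R + G + B), and it assumes that the "linear shift and scaling" is a min-max rescaling to [0, 1], since the text does not state the target range. The function names are hypothetical.

```python
def hsi_saturation(r, g, b):
    """Saturation of one RGB pixel under the common HSI definition (assumed)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: saturation is undefined, use 0 by convention
    return 1.0 - 3.0 * min(r, g, b) / total


def saturation_image(rgb_image):
    """Map a 2D list of (R, G, B) tuples to a 2D list of saturation values."""
    return [[hsi_saturation(*pixel) for pixel in row] for row in rgb_image]


def normalize_channel(channel):
    """Linear shift and scaling to [0, 1] (min-max; the target range is an assumption)."""
    values = [v for row in channel for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in channel]  # flat image: nothing to stretch
    return [[(v - lo) / (hi - lo) for v in row] for row in channel]


# Tiny usage example with made-up pixel values.
rgb = [[(255, 255, 255), (200, 120, 160)],
       [(150, 60, 140), (255, 0, 0)]]
normalized = normalize_channel(saturation_image(rgb))
```

Any other fixed linear mapping (for example, to a preset mean and range) would fit the description equally well; min-max is chosen here only because it needs no extra parameters.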

Subsequently, a SURF detector is used to localize granules in the saturation image masked by the cytoplasm region. SURF is a scale- and rotation-invariant interest point detector and descriptor. The Gaussian scale-space representation of a given image f(x, y) is a family of derived signals L(x, y; t) defined by the convolution of f(x, y) with the Gaussian kernel

\[ g(x, y; t) = \frac{1}{2\pi t}\, e^{-(x^2 + y^2)/(2t)} \]

such that

\[ L(x, y; t) = g(x, y; t) * f(x, y) \]

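The scale parameter t indicates the scale level being defined. A multiscale blob detector at any given fixed scale can be obtained from the local maxima and local minima of the determinant of the Hessian matrix

\[ \det H L(x, y; t) = L_{xx} L_{yy} - L_{xy}^2 \]

where L_{xx}, L_{yy}, and L_{xy} are the second-order partial derivatives of L(x, y; t).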
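
To make the determinant-of-Hessian idea concrete, the sketch below evaluates the response on a small synthetic image and picks the strongest blob over position and scale. It is an illustration under stated assumptions, not the SURF implementation: it uses exactly sampled Gaussian derivative kernels instead of SURF's box filters and integral images, it multiplies the determinant by t² (the usual scale normalization, assumed here so that responses at different scales can be compared), and all function names are hypothetical.

```python
import math


def second_derivative_kernels(t):
    """Sample the second derivatives of the Gaussian g(x, y; t); t is the variance."""
    r = int(math.ceil(4 * math.sqrt(t)))  # truncate the kernels at 4 sigma
    kxx, kyy, kxy = [], [], []
    for y in range(-r, r + 1):
        row_xx, row_yy, row_xy = [], [], []
        for x in range(-r, r + 1):
            g = math.exp(-(x * x + y * y) / (2.0 * t)) / (2.0 * math.pi * t)
            row_xx.append((x * x / t ** 2 - 1.0 / t) * g)
            row_yy.append((y * y / t ** 2 - 1.0 / t) * g)
            row_xy.append((x * y / t ** 2) * g)
        kxx.append(row_xx)
        kyy.append(row_yy)
        kxy.append(row_xy)
    return kxx, kyy, kxy, r


def apply_kernel(img, kernel, r, cx, cy):
    """Filter response at (cx, cy); pixels outside the image count as zero."""
    h, w = len(img), len(img[0])
    acc = 0.0
    for dy in range(-r, r + 1):
        yy = cy + dy
        if 0 <= yy < h:
            for dx in range(-r, r + 1):
                xx = cx + dx
                if 0 <= xx < w:
                    acc += kernel[dy + r][dx + r] * img[yy][xx]
    return acc


def strongest_blob(img, scales):
    """Return (t, x, y) maximizing the scale-normalized response t^2 (Lxx Lyy - Lxy^2)."""
    best = None
    for t in scales:
        kxx, kyy, kxy, r = second_derivative_kernels(t)
        for y in range(len(img)):
            for x in range(len(img[0])):
                lxx = apply_kernel(img, kxx, r, x, y)
                lyy = apply_kernel(img, kyy, r, x, y)
                lxy = apply_kernel(img, kxy, r, x, y)
                response = t * t * (lxx * lyy - lxy * lxy)
                if best is None or response > best[0]:
                    best = (response, t, x, y)
    return best[1:]


# Synthetic test image: one Gaussian blob of variance 4 centered at (10, 10).
image = [[math.exp(-((x - 10) ** 2 + (y - 10) ** 2) / 8.0) for x in range(21)]
         for y in range(21)]
best = strongest_blob(image, [1.0, 2.0, 4.0, 8.0])
print(best)  # (4.0, 10, 10)
```

For a Gaussian blob of variance t0, the normalized response at the blob center is proportional to t²/(t0 + t)⁴, which peaks at t = t0. This is why the selected scale can serve as a relative size estimate.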

By approximating the Gaussian second-order derivatives with box filters and relying on integral images for image convolutions, interest points are localized in the image and over scales. The initial interest points are then interpolated in scale and image space to obtain accurate locations. The interpolated value is also used to reject false positive granules with low contrast; the rejection threshold is set here from the mean saturation of the cytoplasm and nucleus. Finally, interest points within a certain Euclidean distance dm of each other are merged; in this case dm = 4.
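
The final merging step could look like the sketch below. The text does not say how merged points are combined, so this example assumes a greedy rule: the point with the strongest response is kept and any weaker point within dm of a kept point is dropped. The tuple layout (x, y, scale, response) and the function name are hypothetical.

```python
import math


def merge_close_points(points, d_m=4.0):
    """Greedy merge: keep the strongest point, drop weaker ones within d_m of it.

    points: list of (x, y, scale, response) tuples (layout assumed for illustration).
    """
    kept = []
    # Visit candidates from strongest to weakest response.
    for p in sorted(points, key=lambda q: q[3], reverse=True):
        if all(math.hypot(p[0] - k[0], p[1] - k[1]) > d_m for k in kept):
            kept.append(p)
    return kept


# Made-up detections: the first two lie about 2.2 pixels apart and are merged.
detections = [(10, 10, 2.0, 0.9), (12, 11, 2.5, 0.5), (30, 30, 3.0, 0.7)]
merged = merge_close_points(detections)
print(merged)  # [(10, 10, 2.0, 0.9), (30, 30, 3.0, 0.7)]
```

Averaging the positions and scales of the merged points would be an equally plausible reading of the description.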

After granules are located in their characteristic scales, seven attributes are extracted to describe the granules within the same cell:

  • Number of granules Ng.

  • Density of granules Dg:

    \[ D_g = \frac{N_g}{A_{\mathrm{cyto}}} \]

    where Acyto is the area of cytoplasm.

  • Average scale μs. The size of each granule is directly reflected by the scale at which it is extracted. Hence, the scales are treated as relative granule sizes, and the average scale indicates the mean size of the granules in the cell.

  • Coefficient of variation of granule scales CVs:

    \[ CV_s = \frac{\sigma_s}{\mu_s} \]

    where σs is the standard deviation of scales.

  • Saturation contrast between the granule centers and cytoplasm SC:

    equation image

    where Mcenter is the mean saturation of the granule centers, and MCyto is the mean saturation of the cytoplasm.

  • Uniformity coefficient of granule size UCg:

    \[ UC_g = \frac{D_{60}}{D_{10}} \]

    where D60 is the diameter of a sieve that just allows 60% of aggregate to pass through, and D10 is the sieve size at which 10% passes.

  • Spatial distribution uniformity of granules DUg. Suppose a cell is divided into small square grids of fixed side length (e.g., 20 pixels). The granule density is calculated separately for each grid in which the majority of pixels belong to the cytoplasm. Then, DUg is quantified as follows:

    equation image

    where Dgi is the granule density of the ith grid, and m is the number of grids counted. DUg evaluates the uniformity of the granule positions for grading and supplies useful information for rejecting false positive granules caused by disturbances such as the dark edge of the cytoplasm or neighboring platelets that are incorrectly incorporated into the cytoplasm by automatic segmentation.
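
As a rough illustration of the attributes whose definitions are fully stated above (Ng, Dg, μs, CVs, and UCg), the sketch below computes them from a list of granule scales and the cytoplasm area. SC and DUg are left out because their formulas are not reproduced in the text. Several choices are assumptions: the population standard deviation is used for σs, D60 and D10 are taken as linearly interpolated percentiles of the granule scales, and the function and key names are hypothetical.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Linearly interpolated percentile, q in [0, 100] (interpolation rule assumed)."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def granule_features(scales, cytoplasm_area):
    """Compute Ng, Dg, mu_s, CV_s and UC_g for one cell."""
    n = len(scales)
    if n == 0:
        # No granules detected: report zeros rather than dividing by zero.
        return {"Ng": 0, "Dg": 0.0, "mu_s": 0.0, "CV_s": 0.0, "UC_g": 0.0}
    mu = statistics.mean(scales)
    sigma = statistics.pstdev(scales)  # population standard deviation (assumed)
    ordered = sorted(scales)
    return {
        "Ng": n,                                   # number of granules
        "Dg": n / cytoplasm_area,                  # granules per unit cytoplasm area
        "mu_s": mu,                                # average scale
        "CV_s": sigma / mu,                        # coefficient of variation
        "UC_g": percentile(ordered, 60) / percentile(ordered, 10),  # D60 / D10
    }


# Made-up example: four granules in a cytoplasm of 2000 pixels.
features = granule_features([2.0, 2.0, 4.0, 4.0], 2000.0)
```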

AUTOMATIC TG GRADING

After feature extraction, a K-nearest-neighbor (KNN) classifier is constructed from the features of the sample set and assessed on the test set. In the KNN classifier, a new point is classified according to the majority class membership of the K closest training data points; in this case K = 3. To reduce the correlation among the features and improve the generalization of the classifier, the features are first compressed by principal component analysis (PCA) before classification. PCA is the orthogonal projection of the data onto a lower dimensional linear space, known as the principal subspace, such that the variance of the projected data is maximized (12). The projected variables, called principal components, are uncorrelated linear transformations of the original observations, combined in such a way that the first principal component has as high a variance as possible (accounting for as much of the variability as possible), and each succeeding component in turn has the highest variance under the constraint that it be orthogonal to the preceding components. The dimensionality of the projected space is determined here by choosing the smallest dimension that explains over 80% of the total variation.
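
The KNN decision rule with K = 3 can be sketched in a few lines. This is a generic illustration with Euclidean distance (the distance metric is an assumption, as the text does not name one), applied to points that would already be PCA-projected. The training points below are invented, and ties are not handled specially.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Invented 2D points standing in for PCA-projected features of two TG levels.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = [0, 0, 0, 2, 2, 2]
print(knn_predict(points, labels, (0.2, 0.2)))  # 0
print(knn_predict(points, labels, (5.5, 5.5)))  # 2
```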

To evaluate the proposed method for automatic TG detection and grading, 505 neutrophil images were selected to form the Sample Set and Test Set, including 192 normal neutrophils and 313 neutrophils with TG. Neutrophils were initially graded into five categories (0 through 4+). To improve the statistical significance, the categories were then combined into Level 0 (no TG observed), Level 1 (1+ and 2+ TG), and Level 2 (3+ and 4+ TG) (see Table 1 for the exact number of cells at each level and Figure 1A for sample images). Figure 1B illustrates a sample of the automatically extracted granules in a neutrophil. The granule localization and size estimation results are highly consistent with the actual appearance of the cell. The proportion of variation explained by each principal component of the sample features is shown in Figure 1C. The first two principal components explain 74.2% of the total variation, and the first three explain 85.3%. TG detection and grading are therefore conducted with the first three principal components. Figure 1D plots the three classes of samples (Level 0, Level 1, and Level 2) in the coordinates of the first and second principal components. The data show considerable differences among the three levels. For the automatic TG detection experiment, Level 1 and Level 2 are combined as the TG-positive level, and we achieved a sensitivity of 85.2% and a specificity of 96.4%. The proposed method is therefore reliable for predicting TG. For automatic TG grading, the percentage agreement for Level 0, Level 1, and Level 2 is 96.4, 67.3, and 74.7%, respectively. Twenty-nine cells are misclassified between Level 1 and Level 2 (12 cells from Level 1 to Level 2, and 17 cells from Level 2 to Level 1). Misclassification between Level 1 and Level 2, caused by staining variation and other morphological disturbances (such as toxic vacuolation and Döhle bodies), dominates the errors in the grading result.
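
The choice of three principal components follows from the 80% rule applied to the cumulative percentages quoted above. A minimal sketch, using only the two cumulative values reported in the text (the function name is hypothetical):

```python
def smallest_dimension(cumulative_percent, threshold=80.0):
    """Return the smallest number of components whose cumulative share exceeds the threshold.

    cumulative_percent maps a component count to the cumulative percentage explained.
    """
    for k in sorted(cumulative_percent):
        if cumulative_percent[k] > threshold:
            return k
    return None  # threshold not reached by any listed count


# Cumulative explained variation for two and three components, as reported in the text.
reported = {2: 74.2, 3: 85.3}
print(smallest_dimension(reported))  # 3
```

Because the cumulative percentage can only grow with the number of components, a single component must explain less than 74.2%, so the missing entry for one component does not change the result.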

Figure 1.

Sample images for different levels of TG and automatically extracted granules. (A) Sample images of TG; left to right: Level 0, Level 1, and Level 2. (B) Detected TG in a neutrophil; the circles and their radii indicate the estimated granule locations and sizes. (C) The proportion of variation explained by the PCA-projected features; each bar shows the percentage of variability explained by the corresponding principal component, and the line above the bars shows the cumulative percentage. (D) Scatter diagram showing the three classes of samples in the coordinates of the first and second principal components: (pentagrams) Level 0, (circles) Level 1, (asterisks) Level 2. [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Table 1. Number of cells at each toxic granulation (TG) level in the sample set and the test set

TG LEVEL    SAMPLE SET    TEST SET
0           55            137
1           58            107
2           52            95

Cell images from the sample set are used to train the classifier, and those from the test set are used to test the proposed method. Level 0 indicates cells without TG; Level 1 contains cells manually graded as TG 1+ or 2+, and Level 2 contains cells graded as TG 3+ or 4+.

SUMMARY

The experiments show that the sensitivity and specificity for TG detection are both promising, and that the percentage agreement of TG grading for Levels 1 and 2 is acceptable. However, the three-level grading is relatively coarse for detailed TG grading. Considering that TG grading of a single cell is qualitative and subjective, this coarseness may be less important, because the TG grade of the whole slide depends on the statistics of a certain number of neutrophils detected in the slide. Owing to the high stability of SURF and the image normalization, the algorithm also shows excellent robustness to illumination and stain variation. Nevertheless, a standard and consistent staining process is preferred, because it helps precise segmentation and granule analysis. As toxic granules are relatively small objects in the image, the focus quality of the images should also be ensured, because defocusing could disturb TG detection by raising the scales and reducing the contrast of granules. This TG detection method could also be used for counting particles within a cell. In that case, detected objects should be categorized according to their sizes (scales) for content analysis.

In conclusion, we believe that the proposed TG analysis method is useful for automated TG detection and grading. The method could be used on its own for TG analysis, or integrated as a module in an automated blood film analysis system for granule feature extraction and automatic TG evaluation. Future work will focus on detailed TG analysis and on developing practical methods for the automatic detection of other morphological abnormalities, such as toxic vacuolation, Döhle bodies, hyposegmentation, and hypersegmentation. These abnormalities are often associated with severe infection or other clinical conditions (2). Their automatic and quantitative analysis could provide a comprehensive description of leukocyte morphological abnormalities for clinical diagnosis.
