A digital pathology tool for quantification of color features in histologic specimens

Abstract In preclinical research, histological analysis of tissue samples is often limited to qualitative or semiquantitative scoring assessments. The reliability of this analysis can be impaired by the subjectivity of these approaches, even when read by experienced pathologists. Furthermore, the laborious nature of manual image assessments often leads to the analysis being restricted to a relatively small number of images that may not accurately represent the whole sample. Thus, there is a clear need for automated image analysis tools that can provide robust and rapid quantification of histologic samples from paraffin‐embedded or cryopreserved tissues. To address this need, we have developed a color image analysis algorithm (DigiPath) to quantify distinct color features in histologic sections. We demonstrate the utility of this tool across multiple types of tissue samples and pathologic features, and compare results from our program to other quantitative approaches such as color thresholding and hand tracing. We believe this tool will enable more thorough and reliable characterization of histological samples to facilitate better rigor and reproducibility in tissue‐based analyses.

carefully controlled workflows that ensure consistency in sample preparation and reduce interobserver variability. 8
Digital color thresholding is a commonly used method to automate the analysis of features of interest in color images of histologic samples. 4,11-13 In these methods, a threshold value is identified for each of the three RGB (red, green, and blue) color channels to isolate a specific subset of color shades. However, color features in standard histologic stains (e.g., hematoxylin and eosin [H&E]) are typically blended shades of the three RGB colors. As a result, what is visually distinct to the eye can be difficult or even impossible to isolate using a simple color thresholding approach. To overcome this limitation, more sophisticated approaches have been developed that can identify single features of interest within specific stains. 14-16 While useful for certain focused applications, these approaches are limited by their lack of adaptability to any color feature of interest regardless of histochemical stain. Thus, there remains a need for software that has the efficiency and reproducibility of the existing automated methods but is also easily adaptable to many different types of histological features/stains within a single workflow.
The aim of this work was to address the need for more rapid, reliable, and adaptable methods of digital analysis in histological specimens. To that end, we developed and validated an automated image analysis approach (DigiPath) that uses a color-based classification algorithm to identify and rapidly quantify areas of interest in color images. We demonstrate that this approach is accurate, reliable, and significantly faster than a standard method of hand tracing areas of interest. We also show that it can be used for assessment of a wide array of different histological features in human and animal biopsy specimens. Based on the evidence presented here, we believe DigiPath can enable comprehensive, reproducible, and rapid analysis of histology specimens in preclinical research.
FIGURE 1 DigiPath is a more efficient method for quantification than hand tracing. (a) Representative image from an H&E section of a kidney during normothermic machine perfusion (NMP). Obstructions are quantified using hand tracing or DigiPath by three individual users (User 1-magenta, User 2-yellow, User 3-cyan). (b) Total area quantified using hand tracing or DigiPath methods from three users. (c) Total time elapsed for hand tracing (circles, black line) or DigiPath (squares, gray line) methods across three users for five separate images. Mixed-model ANOVA showed a significant difference between the DigiPath and hand tracing cumulative analysis times (**p = 0.0027).
2 | RESULTS

| DigiPath yields more efficient results when compared to hand tracing methods
Hand tracing is frequently used as a standard method to quantify areas of interest in IHC-stained tissue sections from preclinical biopsy specimens. We therefore compared DigiPath to traditional hand tracing as a gold standard. Three independent users were asked to quantify areas of microvascular obstruction in human kidney biopsies that underwent normothermic machine perfusion (NMP) using both hand tracing in ImageJ and DigiPath. These microvascular obstructions (not classic thrombi) are structures unique to the serum-free perfusate conditions of NMP and are easily identifiable on histology. 9 With DigiPath, we observed that features were systematically ~1%-3% smaller than with hand tracing. This result is consistent with a small and systematic overestimation of size by users when hand tracing features (Figure 1a and Supplemental Figure 1). We also found that images with more positive area (e.g., Image 4) showed higher inter-user variability (~21% error of the mean) with either method (Figure 1b). Nevertheless, there was general agreement in quantified area between hand tracing and DigiPath methods (Figure 1b). Although inter-user variability was observed, we found that each user reported similar relative trends in the amount of positive area (i.e., User 1 values > User 2 values > User 3 values).
While DigiPath produced similar results to hand tracing, we found that DigiPath significantly reduced the time of analysis. To quantify five images by hand, users took on average 98.5 ± 80.3 min. However, when using DigiPath to quantify those same five images, users took on average 7.6 ± 1.7 min in total. This number includes the time it took users to set parameters in training images and process images of interest. A repeated-measures mixed-model two-way analysis of variance (ANOVA) showed that the method of analysis (i.e., DigiPath vs. hand tracing) had a significant effect on cumulative analysis time (p = 0.0027; Figure 1c). We next extrapolated how long it might take to analyze 500 images, a typical number of images in our prior studies. We estimate that we would save at least ~167 h of quantification time compared to hand tracing (Supplemental Figure 2). Analyzing 500 images by hand tracing is unrealistic and would be unlikely to be carried out in a study. Nonetheless, the extrapolation from the average hand tracing time per image provides a conservative estimate of the amount of time that can be saved by quantifying color image features with DigiPath. It also demonstrates how DigiPath analysis can enable a far more in-depth quantitative analysis than is practical with a manual approach.
FIGURE 2 DigiPath achieves better correlation with hand-traced standards than color thresholding. (a) Representative images of a human kidney section stained with H&E. Masks of microvascular obstructions were generated by hand tracing (a composite of three independent user tracings), color thresholding, and DigiPath (overlays of three independent users: User 1-cyan, User 2-yellow, User 3-magenta). Areas of undercounting (orange arrows) and overcounting (green arrows) from color thresholding are shown. (b) F-score, Matthews correlation coefficient (MCC), and Youden's J statistic were calculated to measure the correlation of results from thresholding and DigiPath methods with the hand-traced standard. Lines represent median. **p < 0.01; ****p < 0.0001. Scale bars = 20 μm.
2.2 | DigiPath achieves greater correlation with hand-traced standards than color thresholding
Standard thresholding methods, which pick a specific threshold value of intensity for each of the three colors to distinguish between feature and background, are reliable only under conditions where the pathologic features are predominantly a single distinct color (red, green, or blue). However, typical pathologic features (and tissue backgrounds) are mixes of the individual red, green, and blue color channels, meaning that they cannot be easily separated this way without either setting the threshold too high (leading to high false negative rates) or setting the threshold too low (leading to high false positive rates). To compare the accuracy of DigiPath to a standard color thresholding approach, three independent users quantified a set of six images using both DigiPath and color thresholding in ImageJ.
The results from both DigiPath and standard thresholding were analyzed against a hand-traced standard. We found that the color thresholding method resulted in a tradeoff between sensitivity (accu-  Across three users, DigiPath consistently classified regions of microvascular obstruction in kidneys with significantly greater correlation to hand-traced standards than was achieved using color thresholding (Figure 2b).
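The threshold tradeoff described in this section can be illustrated with a minimal sketch of per-channel RGB thresholding. This is not the ImageJ implementation or the DigiPath algorithm; the function name `threshold_rgb`, the pixel values, and the threshold bounds are all hypothetical and were chosen only to show how a blended feature color is either missed or captured together with background. Plain Python lists are used to keep the sketch dependency-free.

```python
def threshold_rgb(image, lower, upper):
    """Return a boolean mask: True where every channel lies within its bounds.

    image: rows of (R, G, B) tuples with 8-bit values (0-255).
    lower, upper: (R, G, B) tuples giving inclusive per-channel bounds.
    """
    return [
        [all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper)) for px in row]
        for row in image
    ]


# Hypothetical pixels: a saturated red feature, a blended pink feature,
# pale pink tissue background, and blank (near-white) slide.
image = [
    [(200, 30, 40), (215, 120, 150)],
    [(235, 190, 205), (250, 250, 250)],
]

# Tight bounds keep only the saturated feature and miss the blended one
# (a false negative).
strict = threshold_rgb(image, lower=(180, 0, 0), upper=(255, 60, 80))

# Loose bounds recover the blended feature but also capture background
# (a false positive).
loose = threshold_rgb(image, lower=(180, 0, 0), upper=(255, 200, 210))
```

In this toy case no single set of per-channel bounds selects both feature pixels while excluding the background pixel, which mirrors the undercounting and overcounting shown in Figure 2a.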

| DigiPath enables quantification of multiple histological features across different stains
We next sought to assess the adaptability of DigiPath for quantification of a variety of different histological features in liver and kidney. We first assessed the ability of DigiPath to quantify the degree of steatosis in a series of three transplant-declined human livers (Figure 3a). Livers 1 and 2 were declined for transplant due to the presence of steatosis, whereas Liver 3 did not list steatosis as a reason for decline (Table 1). We used DigiPath to quantify the area of fat droplets in biopsies from each liver.
DigiPath identified fat droplets in both Livers 1 and 2, and negligible droplet area in Liver 3 (Liver 1 median: 29.2%; Liver 2 median: 9.3%; Liver 3 median: 0.7%). Steatosis is reported as the cumulative area of fat droplets per image area (Figure 3a). DigiPath also allows quantification of the variability of steatotic areas within a single biopsy. We used DigiPath to analyze over 400 20× images covering two sections of a biopsy from Liver 1 and found that the percent steatotic area in individual image fields ranged from less than 1% to over 60%, with a median of 29% steatosis (Figure 3a). This demonstrates the capability of DigiPath to characterize the spatial variation of histologic features within a whole biopsy.
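The per-field readout described above (positive area as a percentage of image area, summarized across the fields of one biopsy) can be sketched as follows. This is an illustrative calculation only, assuming each image field has already been reduced to a binary mask; the function names and the toy masks are hypothetical and do not come from DigiPath.

```python
from statistics import median


def percent_positive_area(mask):
    """Return the percentage of pixels flagged positive in a 2-D mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    positive = sum(1 for row in mask for px in row if px)
    return 100.0 * positive / total


def summarize_fields(masks):
    """Summarize percent positive area across the image fields of a biopsy."""
    values = [percent_positive_area(m) for m in masks]
    return {"min": min(values), "median": median(values), "max": max(values)}


# Toy example: three 2x4 fields with 1, 4, and 6 positive pixels out of 8.
fields = [
    [[True, False, False, False], [False, False, False, False]],
    [[True, True, False, False], [True, True, False, False]],
    [[True, True, True, False], [True, True, True, False]],
]
summary = summarize_fields(fields)
```

Reporting the range alongside the median is what exposes field-to-field heterogeneity of the kind reported for Liver 1.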
Similarly, we found that we could quantify the distribution of fibrosis in livers using the DigiPath tool on Sirius red-stained biopsies.
With Sirius red, high collagen levels, associated with fibrosis, stain red in fibrotic and cirrhotic samples. We assessed two livers with stage  Table 1.

| DigiPath quantifies steatosis in experimental mouse livers
To confirm that DigiPath could quantify features of interest from histological specimens processed outside of our lab, we next sought to determine if DigiPath could accurately quantify previously published results. 18 In a model of murine hepatosteatosis, DigiPath was able to quantify the area of steatosis across a series of images from different animals   course of cold storage in some marginal human organs. 9 We applied TUNEL staining to biopsies collected from a pair of kidneys after  Figure 4).

| Kidney normothermic machine perfusion
Kidneys were prepared and perfused for 1 h on the ex vivo normothermic machine perfusion circuit as previously described. 27 Biopsies were collected at the end of the perfusion period.

| Biopsy and staining procedures
Wedge biopsies were collected and fixed in 10% formalin for a mini-

| Mouse model of steatohepatitis
C57BL/6J mice were obtained from the National Cancer Institute as previously described. 18 All experiments were performed in specific pathogen-free facilities and were performed in accordance with the regulations

| Brightfield imaging
Three sections per biopsy were tiled at 20× magnification using an EVOS FL Auto 2 microscope (Thermo Fisher Scientific). All new images were captured as 24-bit RGB color images with 3.2 million pixels (12 MB) at a resolution of 58,522 pixels per inch. Following image collection, images were manually parsed into "edge" versus "continuous" images to distinguish images that were wholly contained within the section (continuous) versus images that were partially tissue and partially blank space (edge).
Edge images were excluded to avoid artifacts in analysis. Continuous images were then loaded into the program for quantification.

| Color threshold analysis
Images were analyzed in ImageJ. The "Threshold Color" window was opened, and a set of red, green, and blue thresholds was chosen in red/green/blue (RGB) space using the default thresholding method.
The selection was converted to a binary mask, which was saved for evaluating accuracy relative to hand tracing.
Each point represents one image field within a biopsy. The significance of differences between biopsies was calculated using one-way ANOVA.

| Evaluation of DigiPath performance
Area of microvascular obstructions was quantified in six 20× images of kidneys stained with H&E using three quantitative methods: hand tracing, color thresholding, and DigiPath analysis. A set of three independent users analyzed the same six images using each method.
A consensus hand-traced standard was generated for each image by including areas that were selected by all three users. This was used as a standard for comparison to areas found by color thresholding or DigiPath. A confusion matrix was generated where the hand-traced classification of the images served as the "true" positive and negative values, and the classification by color thresholding or DigiPath were positioned as the "predicted" positive and negative values.
Metrics including sensitivity, specificity, and accuracy were calculated from each matrix. Three correlation coefficients, F-score, Matthews correlation coefficient (MCC), and Youden's J statistic, were calculated to assess the performance of the color thresholding and DigiPath methods in the hands of each independent user. The significance of differences between methods was calculated using two-way ANOVA.
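The evaluation steps above can be sketched in a few lines: a consensus standard built from pixels selected by all three users, a confusion matrix against a predicted mask, and the basic metrics derived from it. This is a minimal illustration under the assumption that every tracing and every automated result is available as a binary mask of the same size; the function names and the toy masks are hypothetical and are not the authors' code.

```python
def consensus_mask(masks):
    """Pixelwise AND: keep only pixels selected by every user."""
    return [
        [all(pixels) for pixels in zip(*rows)]
        for rows in zip(*masks)
    ]


def confusion_counts(truth, predicted):
    """Count (TP, FP, FN, TN) pixels, treating `truth` as the standard."""
    tp = fp = fn = tn = 0
    for t_row, p_row in zip(truth, predicted):
        for t, p in zip(t_row, p_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def basic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and accuracy from confusion counts.

    Assumes the standard contains both positive and negative pixels.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Toy 2x4 tracings from three hypothetical users (1 = traced pixel).
user1 = [[1, 1, 0, 0], [1, 0, 0, 0]]
user2 = [[1, 1, 1, 0], [1, 0, 0, 0]]
user3 = [[1, 1, 0, 0], [0, 0, 0, 0]]
standard = consensus_mask([user1, user2, user3])

# Hypothetical automated result to score against the consensus standard.
predicted = [[1, 0, 1, 0], [0, 0, 0, 0]]
counts = confusion_counts(standard, predicted)
metrics = basic_metrics(*counts)
```

Because the consensus keeps only pixels chosen by every user, it is a conservative standard: pixels traced by one or two users count as negatives.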

| Correlation metrics
Three correlation metrics were derived from a confusion matrix of all possible classification outcomes (FP, false positives; FN, false negatives; TP, true positives; TN, true negatives). Youden's J statistic represents "informedness," the probability that the program will make an informed decision. 28 This takes into account both sensitivity and specificity.
The F-score is a measure of accuracy and is derived from sensitivity and precision values. 29 MCC is a measure of both markedness and informedness and accounts for the proportion of occurrences of each possible classification outcome. 17,30,31
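For reference, the three metrics can be computed from the confusion counts using their standard textbook definitions, sketched below. The exact formulas used in the study are not reproduced in this text, so this is an assumption: in particular, the F-score is taken to be the balanced F1 score (equal weight on precision and sensitivity), and the example counts are hypothetical.

```python
import math


def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and sensitivity (assumed beta = 1)."""
    return 2 * tp / (2 * tp + fp + fn)


def matthews_cc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def youden_j(tp, fp, fn, tn):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return tp / (tp + fn) + tn / (tn + fp) - 1


# Hypothetical pixel counts for one image scored against the standard.
tp, fp, fn, tn = 80, 10, 20, 890
scores = {
    "F": f_score(tp, fp, fn),
    "MCC": matthews_cc(tp, fp, fn, tn),
    "J": youden_j(tp, fp, fn, tn),
}
```

Unlike the F-score, MCC and Youden's J both use the true-negative count, so they respond to background pixels that are wrongly included as well as to feature pixels that are missed.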

| ImageJ quantification
Immunofluorescent TUNEL images were processed using the Watershed and Analyze Particles functions in ImageJ. The total numbers of DAPI-positive (blue) and TUNEL-positive (red) cells were quantified.

| CONCLUSIONS
DigiPath is a highly adaptable tool that enables high-throughput, quantitative analysis of any color-defined histologic feature. DigiPath is available for download as a free app in the MATLAB File Exchange (www.mathworks.com/matlabcentral/fileexchange). The app is accessible to researchers regardless of their level of experience with coding and can be operated by users who are not formally trained in pathology. The ability to automatically detect features in histology images based on three-channel RGB color data enables a more quantitative approach to histological analysis, an experimental technique that is already essential in preclinical research.