High content microscopy screening (HCMS) is increasingly used both in gene interference screens with siRNA libraries and in screens of conventional chemical libraries (1–8). Image-based phenotypic screens are potentially more powerful than single-parameter phenotypic screens because multiple quantitative phenotypes can be measured; for example, cell cycle analysis may be coupled to information about cell proliferation/viability and nuclear morphology. The methodology at the core of this approach is the automated segmentation of images, acquired by conventional or fluorescence-based microscopy, into elements of interest. Although counting of segmented objects is frequently used as a phenotypic measure, multiparameter intensity quantification of objects has so far rarely been used, owing to biases arising from uneven illumination and local background variation across the imaging field and from variability in staining across sample wells. Although such biases may have only a small impact on object recognition (provided that the signal-to-noise ratio is sufficiently high), they have a very significant impact where object intensities are required as part of the phenotype. These issues become even more significant when hundreds of thousands or millions of images need to be processed, as is typically the case for genome-scale screens. Thus, effective calibration methods are necessary for the extraction of meaningful intensity measures from a screen.
In high throughput imaging screens, there is a tradeoff between the quality of the images and the acquisition time of each image, imposed by the large number of images to capture and by the stability of the sample (cell or probe) over the duration of the experiment. Short acquisition times per field are paramount, and thus robust focusing methods, such as software-based searches for the optimal focal plane among multifocal plane images or the collection of a stack of multifocused images for off-line analysis, are typically not used. Hence, the collected images may not all be captured at optimum focus, which further increases the noise in the intensity estimates.
Intensity calibration methods are therefore essential to increase accuracy and overall sensitivity in a large screen. Although fluorescence intensity correction methods using a fluorescent solution of evenly distributed fluorochromes (9) have been developed for slide-based experiments, these methods rely on the flatness of glass coverslips and are not easily applicable to the optically transparent multiwell cell culture plates that are used in large screens. Other correction methods, which require images of the same cell captured at various locations in the field of view (10), cannot be performed using the acquisition software in some commercial HCMS systems.
Multiparameter imaging screens of the cell cycle and nuclear morphology also require automated methods for partitioning the cell populations. For example, all fluorescence-based cell cycle assays require accurate estimation of DNA content and DNA synthesis from DNA binding fluorochromes and measurement of additional parameters (11). Well-to-well or plate-to-plate variances in overall measurements, such as overall fluorochrome intensity and effect-driven alterations in sub-population abundances, create significant problems for automated partitioning of cell intensity measurements, a problem analogous to gating in flow cytometry (7, 12). When hundreds of thousands or millions of images are collected, as is the case in genome-wide screens, typical methods using manual (6–9) or semi-automatic partitioning (11–13) (e.g., using FlowJo, Tree Star Inc, Ashland, OR or ModFitLT, Verity Software House, Topsham, ME) are neither feasible nor objective. We have addressed these issues for plate-based multiparameter cell cycle analysis by implementing a simple, automated method for partitioning the corrected intensity estimates extracted from image-based cell cycle screens. Our method can adjust for variability in the assay over hundreds of thousands of images and avoids the complexities and errors in model specification and cluster number estimation that are inherent in cluster-based methods (14). A further practical issue is that most image correction and partitioning software sold commercially with imaging instruments cannot easily be modified or deployed in a distributed compute cluster environment, which is a limitation for large screening environments.
Taking the above into account, we sought to develop a simple, universal physical method of image calibration and background correction applicable to high-throughput image-based screens in microtiter plates. We integrated this method with a standard microplate-based imaging assay of DNA content and BrdU incorporation, where intensity correction is absolutely required to achieve proper estimation of the proportions of cells in different phases of the cell cycle. To demonstrate the utility of our methods, we implemented them in an image-based multiparametric cell cycle and morphology screen of the colorectal carcinoma-derived cell line HCT116 against siRNAs targeting 779 kinases. We show how image correction affects the quantitative measurement and partitioning of cell populations, and we exemplify the ability to retrieve quantitative phenotypes under these screening conditions. Our source code and scripts are freely available for modification and extension by others.
MATERIALS AND METHODS
Preparation of beads
Multispectral beads (TetraSpeck fluorescent microsphere, 4 μm T7283, Invitrogen, Carlsbad, CA) and multi-intensity beads (InSpeck Blue:350/440 fluorescent microsphere, 2.5 μm, Invitrogen) were deposited at varying concentrations into separate wells of a 384-well optical quality (Greiner microclear) tissue culture plate, the same type of plate used in all subsequent assays. Phosphate-buffered saline (PBS) was mixed into the wells to disperse the beads within the well. The plate was centrifuged and the beads were allowed to settle to the bottom of the plate for at least one hour before images were captured.
HCT116 human colon carcinoma cells were maintained in McCoy's 5A media (Gibco, Invitrogen) supplemented with 10% fetal bovine serum (Sigma-Aldrich, St. Louis, MO) and 20 mM L-glutamine, at 37°C in a humidified atmosphere containing 5% CO2. For controlled experiments in arresting cells in G1 phase, cells were serum starved in media containing 0.1% FBS for 24 h. G2 arrest was induced by incubation for 12 h with 500 nM etoposide (13).
BrdU incorporation and DAPI staining
Cells were incubated with culture media containing 50 μM 5-bromo-2′-deoxyuridine (BrdU) (Invitrogen, Carlsbad, CA) for 30 min at 37°C. The media was removed and the cells were washed with PBS, pH 8.0. Subsequent wash steps were also performed at pH 8.0. Cells were fixed by incubation with ice-cold 100% ethanol for 30 min. The ethanol was removed and the cells were washed with PBS. The cells were then incubated with 2 N HCl in PBS for 20 min at room temperature. The acid solution was removed and the cells were washed three times with PBS. After the final wash, the cells were incubated in PBS, pH 8.0, for 10 min to ensure that all traces of acid were removed. The buffer was then removed and replaced with the staining solution, which consisted of PBS containing 1% bovine serum albumin, 0.3% Triton X-100, 0.8 μg/mL anti-BrdU antibody conjugated to Alexa-488 (Invitrogen) (i.e., a 1:250 dilution of the stock antibody solution), and 1 μg/mL 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI, Sigma-Aldrich, St. Louis, MO). After an incubation period of 1 h, the staining solution was removed and the cells were washed twice with PBS. Finally, the PBS was replaced with Tris-buffered saline (TBS), pH 8.0.
siRNA screen of human protein kinases
The siRNA screen was performed using the human siGENOME Protein Kinase siRNA Library from Dharmacon (Chicago, IL), which consists of pools of four siRNAs directed against each of the 779 phylogenetically related kinases in the RefSeq database (NCBI). The siRNA library was distributed across three 384-well plates. The first four columns of each plate were reserved for controls. Positive transfection controls consisted of eight wells of siRNAs against PLK1 and eight wells of siTOX reagent (Dharmacon). Negative transfection controls consisted of 40 wells of transfection reagent only (Lipofectamine 2000 from Invitrogen, Carlsbad, CA) and eight wells of siGENOME Non-Targeting siRNA #2 (Dharmacon). The controls were arranged in the same interspersed pattern for each plate. The screen was conducted in duplicate in the HCT116 cell line by transfecting the cells with a final concentration of 25 nM siRNA per well in 384-well format. Cells were plated using a Multidrop Combi reagent dispenser from Thermo Fisher Scientific (Waltham, MA). Lipofectamine 2000 was used as the transfection reagent according to the manufacturer's instructions and was also used on any remaining wells that did not get any siRNA. A final volume of 0.05 μL Lipofectamine 2000 was used per well, in a total transfection volume of 25 μL. High throughput transfections were performed with the aid of a Hamilton Microlab Starlet liquid handling robot from Hamilton Company (Reno, NV). Three days after transfection, the plates were processed as described above, and fluorescence images from eight fields were collected per well.
An HCMS system (IN Cell Analyzer 1100, GE Healthcare, Piscataway, NJ) was used to collect images from the samples. Images were captured with a 10× objective lens (Plan Apo 10x/.45, Nikon Corp., Tokyo, Japan). Fluorescence images were captured using the following sets of excitation and emission filters: DAPI (HQ360/40, HQ535/50) and Alexa488 (S475/20, HQ535/50), which gave the best signal response with a multibandpass dichroic filter (Q505LP, Chroma Technology Corp., Rockingham, VT). Exposure time was set to give sufficient object resolution (at least 256 grey levels above the background level; typical exposures of ∼100 ms were used for our samples). For each image field, a single focal plane was captured using a hardware (laser/photodetector) auto-focusing algorithm, which estimated the surface on which the cells were lying (15).
Cells were incubated for 30 min with 50 μM BrdU (Invitrogen, Carlsbad, CA), then collected by trypsinisation. After washing in PBS, cells were fixed by adding ice-cold 100% ethanol while gently vortexing, and incubated at room temperature for 30 min. Cells were then washed with PBS, incubated in 2 N HCl/PBS at room temperature for 20 min, washed twice with PBS and incubated in PBS/1% Tween-20 for 10 min. Cells were stained with a FITC-conjugated anti-BrdU antibody (eBioscience, San Diego, CA) diluted 1:50 in PBS/1% Tween-20 for 1 h on ice. Prior to analysis, cells were resuspended in RNase A (0.5 mg/mL) and propidium iodide (PI) (10 μg/mL) in PBS/1% FBS. Cell data were collected by flow cytometry using a Becton Dickinson FACScalibur instrument (BD, Oakville, ON) and BD Cell QuestPro software (v4.0.2). Data were analyzed using FlowJo software (v8.7.1, Tree Star Inc, Ashland, OR) with manual gating (16).
Image analysis overview
Most commercial image analysis software packages lack the flexibility for open extension or modification of the imaging algorithms and are also restricted in terms of compute cluster deployment. Both of these aspects are limitations for image-based genome-wide functional screens. In our implementation (see Supplementary Materials Fig. S1), we took advantage of a previously described series of image processing modules, CellProfiler, that are implemented in high-level MATLAB code that is freely available and extensible. We augmented CellProfiler (17) with a number of our own MATLAB (The MathWorks, Natick, MA) modules that add the following functions. First, we introduced an image preprocessing filter to compensate for local background variation, to improve the ability of traditional segmentation algorithms to extract objects situated over areas with varying background intensity in the field of view. Next, for the segmented objects, we implemented intensity measures that correct for local background variation, together with size-related measures. A typical full genome screen generates on the order of 10^9 data elements from image processing. We decided to store these in a scalable database platform and used MySQL (http://www.mysql.com) for this purpose. Measurement values derived from the image segmentation pipeline were exported to the database for storage and further analysis, in order to (i) correct the measured values based on the calibration images, (ii) extract the cell cycle parameters, and (iii) consolidate the measures and report the descriptors for each sample well. Retrieval of raw data from MySQL and subsequent analysis were conducted using the R statistical language (http://www.r-project.org). Image processing was implemented on a 108-CPU, 64-bit ROCKS (18) architecture compute cluster. The MATLAB image processing code implemented in CellProfiler and the corresponding R scripts are freely available, on request and via (public svn server or SourceForge).
The variability of the background intensity around objects in the image can affect the accuracy of segmentation. Thus, we chose a conventional (Laplacian- or sharpening-like) approach to make the background level in the image more uniform so that objects were more readily and reliably detected using thresholding techniques. A finite spatial filter approach (19) was considered for improved performance, but was impractical as it would have required a training set of images representative of the numerous conditions the cells were subjected to in the screen. In the implementation of our simpler sharpening-like method, we first transformed the image by applying a low-pass filter (a median filter with a kernel size of a minimum of twice the diameter of the cell nucleus) to estimate the local background, and in effect to blur the image, and then subtracted this blurred image from the original. As a result, the transformed background was reduced to grey levels near zero, whereas the objects of interest remained at significantly higher intensity levels. This operation facilitated subsequent segmentation (using intensity and edge based methods that are currently available in high throughput analysis software) used to detect the objects of interest, and generated a more consistent segmentation mask, as the intensities between the objects and background were rendered more distinct from one another across the field of view. We examined, by visual inspection, global and adaptive methods including thresholding, mixture of Gaussians, the Ridler-Calvard method and the Otsu method as implemented in CellProfiler (17) on a random selection of images representing the most crowded and least crowded wells. We found empirically that the Otsu method (20) appeared to correctly segment better than 90% of objects in the center and edges of the fields (see Supplementary Materials Fig. S2). 
We did not attempt to systematically compare other segmentation methods as this was not the primary objective of this study.
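The preprocessing and thresholding steps described above can be illustrated with a minimal sketch. It is not the authors' MATLAB/CellProfiler code: the function names, the NumPy-only median filter, the reflected border handling, and the clipping of negative values to zero are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def subtract_local_background(image, kernel):
    """Subtract a median-blurred copy of the image; `kernel` must be odd."""
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd")
    image = np.asarray(image, dtype=float)
    pad = kernel // 2
    padded = np.pad(image, pad, mode="reflect")       # assumed border handling
    windows = sliding_window_view(padded, (kernel, kernel))
    background = np.median(windows, axis=(-2, -1))    # low-pass background estimate
    return np.clip(image - background, 0.0, None)     # background falls near zero


def otsu_threshold(values, bins=256):
    """Otsu's method: pick the cut that maximises between-class variance."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist).astype(float)                # pixels at or below each bin
    w1 = w0[-1] - w0                                  # pixels above each bin
    s0 = np.cumsum(hist * centers)
    m0 = s0 / np.maximum(w0, 1.0)                     # mean of the dark class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1.0)          # mean of the bright class
    between = w0 * w1 * (m0 - m1) ** 2
    return edges[np.argmax(between) + 1]              # upper edge of the dark class


def segment(image, nucleus_diameter):
    """Flatten the background, then threshold the flattened image."""
    kernel = 2 * nucleus_diameter + 1                 # at least twice the diameter, odd
    flattened = subtract_local_background(image, kernel)
    return flattened >= otsu_threshold(flattened)
```

On a synthetic field with a linear illumination ramp, objects at the dim and bright ends of the ramp end up at similar grey levels after the subtraction, which is what allows a single global threshold to recover both.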
Correcting object intensity for local background variation
Although the preprocessing step transforms the image such that the background intensity is near zero, the intensities of the objects are also modified by this process. To avoid this, we individually examined each detected object and estimated its intensity and background levels from the unprocessed image, to better adjust for variation in background induced by situations such as cells lying on debris or near air bubbles (Fig. 1b). To accomplish this, we employed a local background correction based on a principle similar to that used in estimating the fluorescence intensities of telomere signals (21). We first dilated the mask for all objects (cells) by a number of pixels (e.g., n = 2 pixels) so that segmentation errors in locating the exact boundary could be accounted for. Next, we dilated only the boundary of the object of interest by one pixel more than that previously used for all detected objects (e.g., n + 1 = 3 pixels). The remaining pixels that were present in this dilated object mask but not present in the dilated mask for all objects (i.e., at the edges of other objects) would represent the local background pixels for that object, $\mathrm{Background}(\mathrm{Object}_i)$ (see Supplementary Materials Fig. S3 for a graphical representation). The number of pixels to dilate (n) was chosen such that the outer border of the dilated mask was sufficiently far from the edge of the object, where the intensities may still have been much higher than the background level. The dilation level could have been derived theoretically using a system response function involving object magnification, estimates of the degree of blur and out-of-focus variations in images from the screen, and estimates of the adjustment needed to account for where the segmentation algorithm places the edge of the object (i.e., at the highest gradient point or at the minimum intensity point). 
Instead, we opted for a simpler, empirical method: we measured the background level at different dilation levels of isolated cells and found the dilation level (n) beyond which further dilation did not substantially change the background estimate. The dilation should not be allowed to extend too far from the object, as the estimate may then no longer be representative of the local background and may be affected by other cells nearby. Under the screen conditions described here, two pixels were found to be sufficient. The average of the background pixels was then calculated and used to represent the local background for correction, $I_{Bgnd}(\mathrm{Object}_i)$. In the event that no pixels were deemed to be background (for example, if cells were too crowded, resulting in touching or overlapping nuclear masks), the algorithm defaulted to a conservative estimate using the background value of the entire image, $I_{Bgnd}(\mathrm{image})$. To avoid this situation, we attempted to plate cells at a density such that the cells did not grow to confluence prior to fixation. Finally, the corrected intensity of the object, $I_c(x,y)$, was given by the difference between the object intensity and the estimated background level of the object, as follows:

$$I_c(x,y) = I(x,y) - I_{Bgnd}(\mathrm{Object}_i), \quad (x,y) \in \mathrm{Object}_i, \tag{1}$$

$$I_{Bgnd}(\mathrm{Object}_i) = \operatorname{mean}\{\, I(x,y) : (x,y) \in \mathrm{Background}(\mathrm{Object}_i) \,\},$$

where $\mathrm{Background}(\mathrm{Object}_i)$ is the set of pixels that lie in the mask of $\mathrm{Object}_i$ dilated by n + 1 pixels but not in the mask of all objects dilated by n pixels, and n is the number of pixels to dilate.
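The ring construction described above can be sketched as follows. This is an illustration under stated assumptions rather than the published implementation: the 3×3 square structuring element, the label-image input with 0 for unlabelled pixels, the function names, and the way the whole-image fallback value is passed in (`image_background`) are all hypothetical choices.

```python
import numpy as np


def dilate(mask, n):
    """Grow a boolean mask by n pixels (3x3 square element, NumPy only)."""
    out = np.asarray(mask, dtype=bool)
    h, w = out.shape
    for _ in range(n):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        out = grown
    return out


def local_background_correct(image, labels, n=2, image_background=0.0):
    """Return {label: (background-corrected mean intensity, local background)}."""
    image = np.asarray(image, dtype=float)
    all_objects = dilate(labels > 0, n)              # every object, grown by n
    result = {}
    for label in np.unique(labels):
        if label == 0:                               # 0 marks unlabelled pixels
            continue
        obj = labels == label
        ring = dilate(obj, n + 1) & ~all_objects     # local background pixels
        if ring.any():
            background = float(image[ring].mean())
        else:                                        # crowded field: fall back
            background = float(image_background)
        result[int(label)] = (float(image[obj].mean()) - background, background)
    return result
```

In this sketch, two objects of equal true brightness, one lying on a clean background and one on a brighter patch of debris, return the same corrected mean intensity because each is referenced to its own ring of neighbouring pixels.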
Estimating the global background and signal response
We implemented a physical instrument calibration method for determining global illumination and background by analyzing images of fluorescent calibration beads of defined intensity. These beads have fluorescence spectra similar to those of the fluorophores of interest and a tightly constrained bead-to-bead variance in fluorescence. From a combination of images from a number of different fields (∼64), we determined a 2D surface estimate of the background intensity level, B(x,y), and a 2D surface estimate of the total bead signal intensity level, S(x,y), at spatial locations (x,y). The first estimated surface constituted the background dark (no object) image, and the second constituted the background bright (full object intensity) image, as if the entire field of view were evenly covered with a homogeneous fluorescent medium.
To extract the background, B(x,y), we first applied to each image a median filter with a kernel at least twice the diameter of the bead signal. Since the beads were well dispersed in each image (fewer than 100 beads per image, to avoid interference from clumping), most of the pixels within the filter kernel area belonged to the background even if a couple of beads were present within this area. Under such conditions, the median value over this area is a better predictor of the background level than the output of a minimum filter, which is sensitive to extreme values (e.g., from dead pixels in the sensor and artifacts in the image), or of an averaging filter, which is skewed by the intensities of any beads that happen to lie within the kernel. For each pixel location, we then averaged the background median intensities over the captured images to yield the representative background level at that xy location.
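A minimal sketch of this background estimate is given below. The NumPy-only median filter, the reflected border handling, and the function names are assumptions made for illustration; any standard median filter could be substituted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(image, kernel):
    """Median filter with reflected borders; `kernel` must be odd."""
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd")
    pad = kernel // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, mode="reflect")
    return np.median(sliding_window_view(padded, (kernel, kernel)), axis=(-2, -1))


def estimate_background(bead_images, kernel):
    """B(x,y): per-pixel mean of the median-filtered bead images."""
    return np.mean([median_filter(img, kernel) for img in bead_images], axis=0)
```

Because sparse beads occupy only a minority of pixels inside each kernel, the median ignores them, and averaging over fields then smooths the residual field-to-field noise.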
To estimate the bright image, S(x,y), we interpolated the extracted intensity level of the bead signals (using the preprocessing and intensity measurement algorithms previously described) from the combination of images to generate the 2D surface estimate. For this, we first extracted the mean and total intensity of each detected object, its xy location, and area. We then performed a rough rejection of any touching objects based on fitting a Gaussian distribution to the area and intensity histograms and rejecting objects that had an area or intensity greater than a threshold value (e.g., two standard deviations from the mean to reject signals that were too large or too bright). A 2D surface interpolation algorithm was then applied to the remaining single objects based on their intensity measures and xy locations. As degradation from illumination and aberration occurred gradually across the image, a low order polynomial fit (e.g., of order 5) was sufficient to capture irregularities in the system aberration response.
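The rejection and interpolation steps can be sketched as a least-squares polynomial surface fit. Several details here are assumptions for illustration: the sample mean and standard deviation stand in for the fitted Gaussian, coordinates are rescaled to [-1, 1] for numerical conditioning, and the function names and default order are hypothetical (the text suggests an order of about 5 for real data).

```python
import numpy as np


def poly_terms(x, y, order):
    """All monomials x**i * y**j with i + j <= order, stacked on the last axis."""
    return np.stack([x ** i * y ** j
                     for i in range(order + 1)
                     for j in range(order + 1 - i)], axis=-1)


def fit_bead_surface(x, y, intensity, area, shape, order=2, n_sd=2.0):
    """Fit S(x,y) to single beads; returns (surface, mask of beads kept)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    intensity, area = np.asarray(intensity, dtype=float), np.asarray(area, dtype=float)
    # Rough rejection of touching beads: too large or too bright.
    keep = (area <= area.mean() + n_sd * area.std()) & \
           (intensity <= intensity.mean() + n_sd * intensity.std())
    h, w = shape
    xn, yn = 2 * x / (w - 1) - 1, 2 * y / (h - 1) - 1      # rescale to [-1, 1]
    coeffs, *_ = np.linalg.lstsq(poly_terms(xn[keep], yn[keep], order),
                                 intensity[keep], rcond=None)
    gy, gx = np.mgrid[0:h, 0:w]
    surface = poly_terms(2 * gx / (w - 1) - 1, 2 * gy / (h - 1) - 1, order) @ coeffs
    return surface, keep
```

A low polynomial order keeps the surface smooth, which matches the observation that illumination falls off gradually across the field rather than abruptly.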
Correcting Object Intensity for Global Signal Variations
Since background intensity can constitute a significant portion of an object's measured intensity in high throughput screens, it should not be ignored or treated as negligible in the intensity correction scheme. The correction scheme we employed is very similar to the more generalized background correction algorithm used in bright-field microscopy, in which the global background estimate, rather than the local background measurement, is used to correct the intensity measure. In this generalized method, the corrected image (the estimate of the true image), C(x,y), is calculated as the ratio of the observed signal of known objects, I(x,y), minus the background, B(x,y), to the bright signal estimate, S(x,y), minus the background [Eq. (2)]. Since we had previously accounted for the local background in our intensity measurements, our corrected result was simply the ratio of the locally corrected signal, Ic(x,y), to the global signal estimate, i.e., the interpolated composite image of the bead intensity distribution, Sc(x,y). Assuming that the cells and beads are captured within the linear response range of the fluorophores (i.e., avoiding saturation and over-excitation of the fluorophores and avoiding large quantization errors with low signal-to-noise measurements) and that the fluorophore response of the object and that of the corresponding fluorophore of the bead are similar, the corrected image, C(x,y), accounts and compensates for the differences in fluorescence response between the objects and the background. As the background did not change much over the small local region occupied by each object, it was assumed to be constant over this limited area, and the intensity correction was therefore applied to each object as a whole (on the extracted data) rather than pixel by pixel.
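For reference, the two corrections described in this paragraph can be written out as follows. This is a transcription of the verbal description, not a reproduction of the published typesetting; the object index i and the evaluation of the bead surface at the object location (x_i, y_i) are notational assumptions.

```latex
% Generalized correction, as described in the text [Eq. (2)]:
% observed signal minus background, divided by bright estimate minus background.
\begin{equation}
  C(x,y) = \frac{I(x,y) - B(x,y)}{S(x,y) - B(x,y)} \tag{2}
\end{equation}

% Object-level form used here, after local background correction:
% locally corrected object signal divided by the bead surface at the object.
\begin{equation*}
  C(\mathrm{Object}_i) = \frac{I_c(\mathrm{Object}_i)}{S_c(x_i, y_i)}
\end{equation*}
```

The second form follows from the first once the background term has already been removed from both the object measurement and the bead measurements by the local correction.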
Automatic Cell Cycle Gating
We first separated the S-phase population based on BrdU labeling and then partitioned cell ploidy based on the DNA (DAPI) distribution of the remaining cells. For the S-phase gating, we relied on the corrected intensity estimates (Fig. 2b), which showed less variance within BrdU channel measurements than the uncorrected estimates (Fig. 2a). The corrected intensity estimates more consistently exhibited a clear bimodal distribution, thus allowing a simple threshold on the corrected BrdU histogram to separate the cells into populations of BrdU+ and BrdU− cells (Fig. 2c). To accomplish the thresholding automatically, we first applied a maximum filter to smooth out and remove the noise over the intensity distribution (i.e., the distribution of the log-scaled mean corrected BrdU pixel intensity of each cell, corresponding to what is commonly plotted in flow cytometry). Other smoothing filters were also tried and can be used here to produce similar or identical effects. Once the leftmost maximum (peak) of the filtered data was found (the BrdU− peak), we searched on either side of this peak for the peak-width intersecting points (at one standard deviation of an assumed Gaussian distribution) and used the midpoint of these to revise the location of the peak. As the number of S-phase cells and the distribution of their intensities were seen to differ considerably from well to well, a suitable (experimentally determined) threshold (horizontal dashed lines in Fig. 2b and vertical dashed lines in Fig. 2c) could not be specified globally, but rather needed to be locally adaptive. Based upon counting the number of objects with BrdU intensities in a neighborhood of the threshold level, we determined for a majority of the images that a distance of three times the standard deviation from the lower intensity peak yielded the optimal threshold, in the sense that it produced lower object counts near the threshold than thresholds set at two or four times the standard deviation. 
The error in setting the threshold at this location was small as there were relatively few objects with total intensities that were slightly above and below this threshold level. We examined an alternative approach, namely searching for the minimum between the BrdU+ and BrdU− peaks, but we found that the method described above was more robust over real screen conditions, where treatment condition effects do not always result in a predominant BrdU+ peak (e.g., JAK2 response in Fig. 4).
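The adaptive BrdU threshold can be sketched as below. This is a simplified illustration, not the authors' code: a moving-average filter is swapped in for the smoothing step, and the bin count, the smoothing width, and the minimum relative height used to skip noise bumps (`min_rel_height`) are hypothetical parameters.

```python
import numpy as np


def brdu_threshold(log_intensity, bins=100, smooth=5, k_sd=3.0, min_rel_height=0.2):
    """Return (threshold, peak centre, sigma) for the BrdU-negative peak."""
    counts, edges = np.histogram(log_intensity, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    sm = np.convolve(counts, np.ones(smooth) / smooth, mode="same")
    floor = min_rel_height * sm.max()
    last = len(sm) - 1
    # Leftmost local maximum that is tall enough to be a real peak.
    peak = next(i for i in range(len(sm))
                if sm[i] >= floor
                and (i == 0 or sm[i] >= sm[i - 1])
                and (i == last or sm[i] > sm[i + 1]))
    # A Gaussian falls to exp(-1/2) of its height one standard deviation out.
    level = sm[peak] * np.exp(-0.5)
    left = peak
    while left > 0 and sm[left] > level:
        left -= 1
    right = peak
    while right < last and sm[right] > level:
        right += 1
    center = (centers[left] + centers[right]) / 2.0    # revised peak location
    sigma = (centers[right] - centers[left]) / 2.0     # half the peak width
    return center + k_sd * sigma, center, sigma
```

Only the negative peak is characterised, so the threshold remains defined even in wells where the BrdU+ population is small or absent, which is the robustness argument made in the text.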
The total DNA histogram (linear scale, Fig. 2d) was then generated from the remaining cells that had BrdU levels below the S-phase threshold level (shaded region in Fig. 2c). To partition the G1 and G2/M phase distributions, we first applied a Gaussian smoothing filter to smooth the noise in the histograms. Although the Gaussian smoothing filter takes longer to compute, it performed better than the average or rank-based filters in handling the histogram distributions that we encountered in the screen. To avoid detection of a peak with very low DAPI staining, we first skipped over the first few data points that corresponded to levels below 0.3 times the intensity distribution mean, such that even if the distribution was predominantly in the G2 phase, a G1 peak (which should be at one half the intensity of the G2 peak) would still be detected. From this point, we then searched for the G1 peak and its peak-width intersecting points on either side. The intersecting points were then used to correct for the “true” location of the G1 peak. We found that using the intersecting points at half maximum did not consistently locate the G1 peak, as the noise in some instances could merge the G1 and G2 data points into only one detectable peak, thereby preventing proper detection. Hence, we opted to use the one standard deviation point from the peak, based on the Gaussian definition. The tangent to the histogram at this intersection point was then extended and used to predict the shape of the G1 peak. The objects contained within the boundaries of the estimate of the G1 peak were then removed from the distribution to better reveal the second (G2/M-phase) peak (e.g., shaded region in Fig. 2d). The location of the G2 peak was then found using a technique similar to that used for the first. If the G2 peak was found within three times the value of the G1 peak, the algorithm selected the value between the G1 and G2 peaks to partition the G1/G2 proportions (long dashed lines in Fig. 2d). 
Otherwise, the algorithm assumed that a G2 peak could not be detected from the distribution and defaulted to using half of the peak-width measure from the G1 peak to set the G1/G2 threshold. The distance between the G1/G2 boundary and the G1 peak was subtracted from the G1 peak to give the pre-G1/G1 boundary and was multiplied by 1.5 and added to the G2 peak to give the G2/polyploid boundary (short dashed lines in Fig. 2d). As our method extracts only the peak and peak-width measures and does not perform curve fitting to any given distribution, it can be executed quickly, which is highly advantageous when large sets of data need to be processed, as in whole genome screens.
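The boundary arithmetic in this paragraph can be sketched as follows, given peak locations that have already been detected. Three points are assumptions rather than statements from the text: the "value between" the two peaks is taken as their midpoint, the fallback boundary is placed at the G1 peak plus half its width, and when no G2 peak is usable its position is assumed to be twice the G1 peak. The function and key names are hypothetical.

```python
def dna_gates(g1_peak, g1_width, g2_peak=None):
    """Return the pre-G1/G1, G1/G2 and G2/polyploid boundaries."""
    if g2_peak is not None and g1_peak < g2_peak <= 3 * g1_peak:
        g1_g2 = (g1_peak + g2_peak) / 2      # assumed: midpoint of the two peaks
    else:
        g1_g2 = g1_peak + g1_width / 2       # fallback from the G1 peak width
        g2_peak = 2 * g1_peak                # assumed: G2 at twice the G1 content
    d = g1_g2 - g1_peak                      # distance from G1 peak to G1/G2 cut
    return {"pre_g1": g1_peak - d, "g1_g2": g1_g2, "polyploid": g2_peak + 1.5 * d}


def phase(dna, gates):
    """Assign a BrdU-negative cell to a phase from its total DNA intensity."""
    if dna < gates["pre_g1"]:
        return "pre-G1"
    if dna < gates["g1_g2"]:
        return "G1"
    if dna < gates["polyploid"]:
        return "G2/M"
    return "polyploid"
```

For example, with a G1 peak at 100 and a G2 peak at 200, the G1/G2 cut falls at 150, the pre-G1 cut at 50, and the polyploid cut at 275. Since only peak positions and widths are used, no curve fitting is required.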
For quality control of the automatic gating thresholds during analysis over an entire screen, we compared the threshold values of each well with those of the other wells in the plate. We also compared the cell cycle parameters (e.g., the G1, S, and G2 fractions) of each well with those of its replicate well(s) on other plates. If the threshold boundaries or the discrepancy in cell cycle parameters between replicate wells were far from the norm (i.e., the S+/− and G1/G2 boundaries fell outside a predefined range around their median values, or the ratio of cell cycle parameters in replicate wells fell outside the prescribed confidence interval evaluated for the replicate wells), these cases were flagged for manual review.
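A minimal sketch of these two checks is shown below. The tolerance values, the use of a relative range around the plate median, the fixed ratio bounds standing in for the confidence interval, and the function names are all hypothetical; the text does not specify them.

```python
from statistics import median


def flag_thresholds(thresholds, rel_tol=0.2):
    """Wells whose gating threshold is far from the plate median."""
    med = median(thresholds.values())
    return sorted(w for w, t in thresholds.items()
                  if abs(t - med) > rel_tol * abs(med))


def flag_replicates(frac_a, frac_b, lo=0.67, hi=1.5):
    """Wells whose replicate ratio of a cell cycle fraction is out of range."""
    flagged = []
    for well in sorted(frac_a.keys() & frac_b.keys()):
        a, b = frac_a[well], frac_b[well]
        ratio = a / b if b else float("inf")
        if not lo <= ratio <= hi:
            flagged.append(well)
    return flagged
```

Wells returned by either function would then be queued for manual review rather than being excluded automatically.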
Assay Measurement Range Estimation (CNR and Z factor)
To estimate and compare the measurement range of our assay, we used the contrast-to-noise ratio (CNR) and the more commonly used Z factor (22). The CNR is similar to the signal-to-noise ratio, but the median and inter-quartile spread (IQS) are used rather than the mean and variance of the data set, which are more influenced by outliers. Hence, given two populations (A, B) of objects (e.g., control and either serum starvation or etoposide treatment), the CNR is defined as the absolute difference of their medians divided by the mean of the inter-quartile spreads of the two data sets:

$$\mathrm{CNR}(A,B) = \frac{|\mathrm{median}(A) - \mathrm{median}(B)|}{\tfrac{1}{2}\,[\mathrm{IQS}(A) + \mathrm{IQS}(B)]}$$
The Z factor, which is often used to assess the quality and performance of high-throughput assays (but is less informative when the noise process is not Gaussian), is given by the following formula using the mean, μ, and standard deviation, σ, of the two populations:

$$Z = 1 - \frac{3\,(\sigma_A + \sigma_B)}{|\mu_A - \mu_B|}$$
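Both measures are straightforward to compute; the following sketch uses only the Python standard library. The choice of the "inclusive" quartile method and of the sample (rather than population) standard deviation are assumptions, as the text does not specify either.

```python
from statistics import mean, median, quantiles, stdev


def iqs(data):
    """Inter-quartile spread: third quartile minus first quartile."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1


def cnr(a, b):
    """Contrast-to-noise ratio: |median difference| over the mean IQS."""
    return abs(median(a) - median(b)) / ((iqs(a) + iqs(b)) / 2)


def z_factor(a, b):
    """Z factor: 1 - 3 (sd_a + sd_b) / |mean_a - mean_b|."""
    return 1 - 3 * (stdev(a) + stdev(b)) / abs(mean(a) - mean(b))
```

For two well-separated populations such as `[1, 2, 3, 4, 5]` and `[11, 12, 13, 14, 15]`, `cnr` returns 5.0, while `z_factor` is only slightly above zero because the standard deviation penalises the spread more heavily than the inter-quartile spread does.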
Calibration of Global Spatial Illumination by Latex Beads
To determine the optical aberration in the light path of our HCMS system, we avoided using a fluorescent medium as a calibration standard (9) (as is done in slide-based systems), since the variable thickness of the plate bottom and the surface tension of the liquid would have given different thicknesses of medium, resulting in increased variability of the field-to-field measurements within the sample well. We also decided against using a smoothed version of the minima or average of pixels over a large collection of typical cell images to estimate the background and aberration response, respectively (17), as this method is highly affected by extremes of bright and dark intensity. We found that even over several thousand images, significant spatial inhomogeneities result when this method is applied (Fig. 1g and 1h). The method of capturing images of the same object at different locations in the field of view and then estimating its intensity differences (10) would have given measurements representative of the objects under study, but most high content imagers in current use do not allow users to acquire such images. We therefore chose to quantify the degree of spatial degradation of intensity measures using latex calibration beads. The spatial intensity distribution over a field of view of latex fluorescent calibration beads, which are known for their tight tolerances in size, spectrum, and intensity (23), indicates how signal intensity levels vary globally across the imaging field. As shown by the pseudo-color differences in the spatial distribution (Fig. 1a and 1c) and the intensity profile over the central x-axis of the field of view (Fig. 1e), the beads near the center are significantly brighter than those at the edges (50–200% more in total object intensity above background). From the interpolation of the combined image of individual beads (Fig. 1a), a global spatial illumination correction function was derived and applied [Eq. 
(2)] to produce a more homogeneous intensity distribution across the imaging field (Figs. 1d and 1f). In comparison, the system response function obtained from the maxima of a collection of 256 cell images showed uneven patches (Fig. 1g); using a larger collection of 1,152 cell images yielded a smoother response function (Fig. 1h). The correlation of corrected intensity measures with the specified (manufacturer) levels of intensity for beads was linear over the specified signal across all regions of the imaging field (r² = 0.9989), suggesting that our object intensity measurements can be accurately corrected using this methodology.
Integration of Local and Global Background Correction Improves Segmentation and Intensity Measurement
Systematic (global) background and object (local) background variation introduces noise and causes loss of accuracy in phenotype measurements. We implemented the preprocessing of images for local background variation and noticed an increase in detected objects, especially near the edges of the image and for objects lying on high intensity background features such as air bubbles. A median of 7% and a mean of 9% more objects were detected per image after preprocessing in our test set of 3,072 images acquired from 384-well plates of HCT116 cells treated only with lipofectamine vehicle, varying depending on the signal-to-noise ratio in a given data set and the degree of spatial degradation in the system. Additionally, the size of objects was more consistent when preprocessing was first applied (size variance reduced by 10% per image in our test set of images).
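A local background correction of this kind can be sketched as follows. The preprocessing filter itself is not specified in this section, so the sketch assumes that the local background of each object is the median of the non-object pixels in a margin around its bounding box; the function names, the margin, and the toy image are hypothetical.

```python
from statistics import median


def local_background(image, mask, label, margin=3):
    """Median of non-object pixels within `margin` pixels of the object's bounding box.

    image: 2D list of pixel intensities.
    mask: 2D list of the same shape; 0 is background, positive integers label objects.
    Raises statistics.StatisticsError if no background pixel lies in the window.
    """
    rows = [r for r, row in enumerate(mask) for v in row if v == label]
    cols = [c for row in mask for c, v in enumerate(row) if v == label]
    r0, r1 = max(min(rows) - margin, 0), min(max(rows) + margin, len(image) - 1)
    c0, c1 = max(min(cols) - margin, 0), min(max(cols) + margin, len(image[0]) - 1)
    ring = [
        image[r][c]
        for r in range(r0, r1 + 1)
        for c in range(c0, c1 + 1)
        if mask[r][c] == 0
    ]
    return median(ring)


def corrected_total_intensity(image, mask, label, margin=3):
    """Total object intensity after subtracting the local background per pixel."""
    bg = local_background(image, mask, label, margin)
    pixels = [
        image[r][c]
        for r in range(len(image))
        for c in range(len(image[0]))
        if mask[r][c] == label
    ]
    return sum(pixels) - bg * len(pixels)


# Toy field: the right half sits on a bright background feature (e.g., a bubble).
image = [[10] * 6 + [50] * 6 for _ in range(6)]
mask = [[0] * 12 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] += 100
        mask[r][c] = 1
    for c in (8, 9):
        image[r][c] += 100
        mask[r][c] = 2

left = corrected_total_intensity(image, mask, 1, margin=2)
right = corrected_total_intensity(image, mask, 2, margin=2)
```

Both objects carry the same signal above background, and both report the same corrected total even though the raw totals differ by the background offset.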
We next assessed the impact of the combined correction estimates (illumination and local background) on our ability to discriminate and measure cell cycle phases. First, we confirmed that the signal in the BrdU channel when no BrdU was added to the medium (with Alexa488 anti-BrdU present) was indistinguishable from background. Furthermore, the signal in the BrdU channel in this instance was well below the thresholds that were used to distinguish BrdU+ cells in BrdU-stained samples. We then examined the BrdU and DAPI intensity distributions of cells (Fig. 2). We observed that without any intensity correction, there was a considerable amount of noise in the intensity measurements, yielding histograms with poorly formed valleys between peaks (data not shown), and that noise was reduced only slightly when only global background correction was applied (Fig. 2a). With the addition of local background correction, there was a significant reduction in noise, as shown by a tighter clustering of the data points, especially for BrdU− signal intensities (Fig. 2b). The higher measures of total BrdU levels of cells in G2 (the result of accumulating more background pixels due to their larger size) were subsequently better corrected, resulting in levels similar to those of cells in G1 (i.e., both should be BrdU−, leftmost column of Fig. 2b).
Moreover, with microplate-based cell cycle analysis, even after intensity corrections, the intensity distributions from high throughput low magnification screens of microtitre plates typically have more noise than the data collected from flow cytometry or data extracted from “selective, focused, and/or higher resolution” images. Therefore, the extraction of the S-phase population as the region between the G1 and G2 peaks (Fig. 2d) from only the DNA distribution profile, as commonly performed in flow cytometry (e.g., using interactive gating with ModFitLT, Verity Software House, Topsham, ME), would not be robust. To define G1 and G2 populations more precisely based on DNA content, BrdU-positive cells (region above horizontal dashed lines in Fig. 2b and to the right of the dashed lines in Fig. 2c) were removed from the DNA intensity histogram (Fig. 2d), resulting in more distinct population peaks for G1 (2N) and G2 (4N) cell populations. In most wells, we observed distinct clusters in the BrdU-DAPI intensity plots (left and center columns of Fig. 2). The DAPI intensity histogram of the remaining BrdU− cells was then used to further partition the population into the different phases of the cell cycle (vertical dashed lines in Fig. 2d). In instances where it was difficult to distinguish between the G1 and G2 peaks due to noise (right column in Fig. 2d), even after the BrdU+ cells had been rejected, our algorithm was still able to determine reasonable estimates of cell cycle fractions. We noticed that the cell cycle proportions extracted from either the corrected total BrdU intensities or the corrected mean BrdU pixel intensities were similar (percentages within 2% of one another, data not shown). For routine screen analysis, we plot these data on linear scales; however, when comparing with flow cytometry data, as in this report, we use a log scale representation to better display objects across the large data range.
The log scale representation has no effect on the partitioning algorithms.
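The two-step gating order (BrdU first, then DNA content) and the independence from the display scale can be illustrated with a short sketch. The adaptive algorithm that locates the gates is not reproduced here; the gate positions are simply passed in, and the function names, gate values, and cell intensities are hypothetical.

```python
import math


def assign_phase(dapi, brdu, brdu_gate, g1_g2_boundary):
    """Label one cell; BrdU+ cells are called S phase before DNA content is examined."""
    if brdu > brdu_gate:
        return "S"
    return "G1" if dapi < g1_g2_boundary else "G2"


def phase_fractions(cells, brdu_gate, g1_g2_boundary):
    """Fraction of cells in each phase for a list of (DAPI, BrdU) intensities."""
    counts = {"G1": 0, "S": 0, "G2": 0}
    for dapi, brdu in cells:
        counts[assign_phase(dapi, brdu, brdu_gate, g1_g2_boundary)] += 1
    total = len(cells)
    return {phase: n / total for phase, n in counts.items()}


# Hypothetical corrected (DAPI, BrdU) intensities for eight cells.
cells = [(100, 5), (105, 8), (98, 6), (110, 7),
         (150, 60), (160, 80), (200, 9), (205, 6)]
linear = phase_fractions(cells, brdu_gate=30, g1_g2_boundary=150)

# The same gates applied to log-transformed data: a monotonic transform
# preserves every comparison, so the fractions do not change.
log_cells = [(math.log10(d), math.log10(b)) for d, b in cells]
logged = phase_fractions(log_cells, math.log10(30), math.log10(150))
```

For gates expressed as intensity thresholds, plotting on a log scale is therefore purely a display choice.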
Automated Partitioning After Image Correction Allows Determination of Cell Cycle Proportions
A common approach in image-based screens involves the partitioning of cell populations into fractions and then using these fractions to characterize the population. In the case of cell cycle analysis, partitioning is required to determine the fraction of cells in different phases of the cell cycle and additional measures such as the mean/median area and form factor of each cell phase. Manual cell cycle phase partitioning or interactive gating, as typically performed in flow cytometry (e.g., FlowJo, Tree Star Inc, Ashland, OR), is not practical for a genome-scale screen, and most clustering methods suffer from the problem of varying cluster identification between samples (14). For our automated partitioning method, we sought to determine whether the measurement range of the cell cycle fractions was sufficient to allow discernment of changes in the cell cycle status of cells in microplate conditions. To establish the measurement range of the full protocol, we examined the effects of serum starvation (expected to cause cells to arrest in G1) and treatment with a topoisomerase inhibitor, etoposide (expected to cause G2 arrest) (Fig. 3). We compared the cell cycle fractions obtained by flow cytometry of cells grown in 6-well plates with those obtained in 384-well plates (Fig. 3). In the case of S-phase, G1, and G2 fractions, a threefold CNR (Z factor > −0.8) was readily achieved and, for some measures, a CNR above 10-fold (Z factor > 0.3) was obtained, thus defining the measurement/sensitivity range of the entire protocol under screen microplate conditions (Fig. 3d). We note that although the total number of cells observed in each well is (variably) decreased with these treatments, the cell cycle fractions nevertheless show sufficient precision and range for screening. The S-phase, G1, and G2 fractions of HCT116 cells were also analyzed by flow cytometry.
It should be noted that, to obtain enough cells for the flow cytometry comparison, cells were necessarily grown in 6-well plates, so the growth conditions are not exactly comparable. Despite these format differences, the data (Figs. 3a and 3b) show considerable agreement in the absolute proportions (differences below 7%) and, more significantly, in the comparative response to the treatment conditions.
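For readers who wish to compute a Z factor for their own control wells, a minimal sketch follows. It uses the definition commonly applied in screening, Z = 1 − 3(σp + σn)/|μp − μn|; the CNR definition used in this work is not restated in this section and is therefore not computed. The well values are hypothetical.

```python
from statistics import mean, stdev


def z_factor(positive, negative):
    """Z factor: 1 - 3 * (sd_p + sd_n) / |mean_p - mean_n| (sample standard deviations)."""
    separation = abs(mean(positive) - mean(negative))
    return 1 - 3 * (stdev(positive) + stdev(negative)) / separation


# Hypothetical G1 fractions from replicate wells.
control_g1 = [0.30, 0.32, 0.34]
starved_g1 = [0.58, 0.60, 0.62]
z = z_factor(starved_g1, control_g1)
```

A Z factor approaches 1 as the separation between the two conditions grows relative to their spread, and becomes negative when the two distributions overlap substantially.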
Image Corrected Microplate-Based Cell Cycle Assay Can Retrieve Quantitative Phenotypes Under 384-Well siRNA Screen Conditions
One final verification step was conducted under screen conditions, namely to confirm the stability of the DAPI and BrdU channel measures over several hours of storage at room temperature. This was necessary because under real world screening conditions, the scanning of plates requires overnight/unattended loading of the INCELL imager with plates, over a 20-h period. We assessed the effects of neutralizing buffers and FITC vs. Alexa488 detection reagents on signal stability. Multiple washing of plates with alkaline PBS and storage of plates with a strong alkaline buffer (TBS pH 8.8) was found to be necessary to maintain stability of the signal over many hours (Supplementary Materials Fig. S4). The use of Alexa488 also significantly improved BrdU signal over a 24-h storage at room temperature.
To test the full set of methods under screen conditions, we conducted a sub-genomic siRNA knockdown screen of human protein kinases (779 genes) in the HCT116 cell line, as described above. The distributions of phenotypic effects in six quantitative image-derived measures of cell viability/proliferation, cell cycle state, and nuclear size are shown in Figure 4a. The combined method of physical image correction, local background correction, and adaptive partitioning was able to retrieve a range of quantitative cell cycle phenotypes from siRNAs previously shown to affect these cell cycle parameters. For example, blockade of IGF receptor signaling is known to arrest cells in G1 (Fig. 4). Blockade of the protein phosphatase CDKN3, an inhibitor of cyclin-dependent kinases, also arrests cells in G1 (24). The cyclin-dependent kinase CDK10 is known to regulate the G2/M transition in human cells and was identified as a high-ranking phenotype of G2 arrest (25). Within this sub-genomic scale screen we also noticed previously undocumented siRNA phenotypes. The bifunctional Janus kinases (JAKs) regulate signaling from transmembrane receptors, and mutations in JAK2 have been identified as drivers in myelodysplastic syndrome. We identified JAK2 as an outlier for several parameters, including the presence of a large population of >4N cells (Fig. 4, right), suggesting genome instability and endoreduplication. One study using broad-spectrum JAK inhibitors (26) has suggested that inhibition of Janus kinases (especially JAK3) may induce endoreduplication in promyelocytic HL60 cells. However, it is unknown whether this would occur in nonmyeloid cells, whether the effects observed may have been off-target effects of the drugs in use, or whether the effects are specific to JAK3. Our screen shows that specific inhibition of JAK2 by a different methodology in a nonhaematopoietic cell type also results in endoreduplication, thus further implicating the Janus kinases in this pathway.
To verify that the JAK2 phenotype was reproducible by conventional cell cycle assay methods, we repeated the siRNA knockdown for JAK2 using pooled and individual siRNAs and measured the cell cycle fractions by flow cytometry. Figure 4 shows that, as with the image-based cell cycle measurement from the screen, siRNA knockdown of JAK2 induces a large population of endoreduplicated 8N cells. We reconfirmed this with two independent siRNA sequences (not shown) from the pool, indicating that an off-target siRNA effect is unlikely. The cell cycle proportions observed by flow (no siRNA control: G1:31%, S:41%, G2:22% and JAK2: G1:26%, S:20%, G2:35%, postG2:13%) are mostly within 2 percentage points of the values obtained by image-based cell cycle analysis (control: G1:38%, S:43%, G2:18% and JAK2: G1:28%, S:22%, G2:34%, postG2:13%); the largest difference, 7 percentage points, is in the control G1 fraction.
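The agreement between the two methods can be checked directly from the percentages quoted above. The short sketch below tabulates the absolute differences; only the variable names are introduced here, and the numbers are those reported in the text.

```python
# Cell cycle proportions (%) as reported in the text.
flow = {
    "control": {"G1": 31, "S": 41, "G2": 22},
    "JAK2": {"G1": 26, "S": 20, "G2": 35, "postG2": 13},
}
imaging = {
    "control": {"G1": 38, "S": 43, "G2": 18},
    "JAK2": {"G1": 28, "S": 22, "G2": 34, "postG2": 13},
}

# Absolute difference, in percentage points, for every sample/phase pair.
differences = {
    (sample, phase): abs(flow[sample][phase] - imaging[sample][phase])
    for sample in flow
    for phase in flow[sample]
}
within_two = [key for key, d in differences.items() if d <= 2]
largest = max(differences, key=differences.get)
```

Five of the seven fractions agree to within 2 percentage points; the control G1 and G2 fractions differ by 7 and 4 points, respectively.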
We noticed that the siRNA effects on the nuclear size (area) for the other cell cycle phases (data not shown) varied similarly to those for the G1 cells (Fig. 4a). Hence, by using the size estimate from only a sub-population of cells (e.g., G1) instead of the total cell population, the effect on nuclear size is decoupled from the proportion of cells in different stages of the cell cycle (as G2 cells are typically larger than G1 cells).
Finally, we manually verified the performance of our automated cell cycle gating of this screen (of over 3,000 wells over eight plates). Although the measured fluorescence in the BrdU channel (and sometimes the DAPI channel) was occasionally very low (in about 8% of the wells, randomly distributed across the plates), our gating algorithm was still able to correctly gate most of these (97%). Of the remaining majority of wells that had sufficient fluorescence signal, there were only a few instances (0.5%) where the algorithm generated the wrong partition. For these instances, the boundaries were far away from the norm, and they were easily detected and flagged for manual review.
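One simple way to flag wells whose gate boundaries lie far from the norm is sketched below. The rule actually used for flagging is not described in this section; the sketch assumes a robust cutoff based on the median absolute deviation (MAD) across wells, and the function name, cutoff, and boundary values are hypothetical.

```python
from statistics import median


def flag_outlier_wells(boundaries, n_mads=5.0):
    """Return well IDs whose gate boundary lies far from the plate-wide norm.

    boundaries: dict mapping well ID -> gate position (e.g., the G1/G2 boundary).
    n_mads: hypothetical cutoff in units of the median absolute deviation.
    """
    centre = median(boundaries.values())
    mad = median(abs(v - centre) for v in boundaries.values())
    if mad == 0:
        # Degenerate case: any well that differs from the common value is flagged.
        return [w for w, v in boundaries.items() if v != centre]
    return [w for w, v in boundaries.items() if abs(v - centre) > n_mads * mad]


# Hypothetical G1/G2 boundaries for six wells; A6 was gated incorrectly.
boundaries = {"A1": 150, "A2": 152, "A3": 148, "A4": 151, "A5": 149, "A6": 310}
flagged = flag_outlier_wells(boundaries)
```

Because the median and MAD are insensitive to a small number of extreme values, a few failed wells do not shift the reference against which the remaining wells are judged.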
Accurate estimation of object intensities from high-content imaging is nontrivial and highly relevant to estimation of quantitative phenotypes. The image correction pipeline developed here consists of two steps, both transferable to other imaging assays. First, an illumination calibration function is determined physically using latex microbeads. Second, local background measures coupled to object segmentation are used to remove local variations in background for more reliable object detection and more accurate object intensity estimates. The preprocessing steps taken here are not limited to deployment in MATLAB/CellProfiler, but could also be employed by adding the preprocessing filter into any sufficiently adaptable software imaging processing pipeline. Without correcting for uneven illumination and optical aberrations inherent in the system and noise artifacts (e.g., air bubbles) that vary from image to image, only the central region of the image with a shallow illumination gradient and regions where there is no debris present should be used to extract accurate intensity measurements. However this leads to a loss of potentially valuable measurements. With image and illumination correction in place, measurements are more accurate, improving the measurement range of an assay. Under screen conditions, where tens or hundreds of thousands of wells (and millions of images) are collected, small errors in measurement can dramatically increase noise. A practical benefit of increased accuracy is that fewer images will need to be captured and fewer wells will need to be flagged for rescreening. Moreover the interpretation of genome-wide results from screening will be rendered more robust. We chose to exemplify these methods with image-based cell cycle measurement, but the instrument calibration and background correction may be applied to any microscopy-based assay in which object segmentation is used.
The methods we used can discriminate relative intensity differences within wells. We observed that the location of the G2 peak is not always precisely twice the estimated intensity of the G1 peak. One of the major contributing factors to this difference is that the background level is typically overestimated when cells become crowded or when images are not well focused. In setting up each screen, much effort has been invested in determining the correct cell seeding density, such that the most proliferative cells do not grow to confluence. Even in wells that are crowded, we observed that most of the cells are properly segmented (Supplementary Materials Fig. S2). Under screen conditions, variations in cell density are inevitable. Nevertheless, it is still clear from the DNA histogram plots that, for the vast majority of wells, there is always a population of cells (e.g., G2) that has higher DNA content than the other (e.g., G1), and these observed proportions closely match those observed with flow cytometry as a measurement standard. Furthermore, it can be observed that in moderate-density wells with polyploid distributions, such as those observed with JAK2 (Fig. 4), the 2N, 4N, and 8N peaks are located at roughly the proper locations, at one, two, and four times the intensity of the 2N peak.
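A toy calculation shows how an overestimated background can move the G2 peak away from twice the G1 peak. All numbers are hypothetical, and the sketch assumes that total object intensity is computed by subtracting a per-pixel background from every pixel of the segmented object.

```python
def measured_total(true_total, area_px, background_overestimate):
    """Total object intensity after subtracting a per-pixel background that is too high."""
    return true_total - background_overestimate * area_px


def g2_to_g1_ratio(background_overestimate, g1_area=100, g2_area=150):
    """Ratio of measured G2 to G1 totals for a given background overestimate.

    Hypothetical values: G2 nuclei carry twice the DNA signal of G1 nuclei
    but occupy only 1.5 times the area by default.
    """
    g1 = measured_total(1000.0, g1_area, background_overestimate)
    g2 = measured_total(2000.0, g2_area, background_overestimate)
    return g2 / g1


exact = g2_to_g1_ratio(0.0)    # no background error
shifted = g2_to_g1_ratio(1.0)  # background overestimated by one unit per pixel
```

With an accurate background the ratio is exactly 2. With the overestimate, the ratio departs from 2 unless the G2 area happens to be exactly twice the G1 area, because the subtracted error scales with object area rather than with DNA content.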
For absolute intensity measurements, we would need to incorporate additional calibration standards into each sample well. Similar to our estimates and normalization of telomere lengths (21), we could use fluorescent beads to monitor the changes in the illumination intensities (aging bulb) and reference objects (e.g., addition of a known cell type such as leukocytes incorporating a discriminating fluorescent protein) to account for variability in sample preparation and staining. As the global aberration function changes slowly over time (as the bulb ages), unless one realigns the illumination source or moves the objective lens or dichroic mirror out of alignment, the estimate of the aberration obtained with the latex beads need not be repeated on every plate. Collection of a latex bead-based correction function whenever the bulb is changed and at the beginning and end of a large screen has been sufficient in our hands to achieve faithful image correction.
We have succeeded in estimating the signal response over the image field of view (global aberration function). Obtaining this image directly is particularly difficult in an HCMS system, as a fluorescence calibration signal that is evenly distributed over the field of view and across different fields of a sample well is required to generate such an image for analysis. In addition, the fluorescent calibration material should have a consistent intensity response and have excitation and emission characteristics similar to those of the probe of interest, since the optical aberration differs for different wavelengths and affects both the excitation and emission wavelengths. Hence, we opted to place beads in a multiwell plate similar to that used for the cells we screen and used the same tissue culture media to closely replicate the experimental conditions. If bead images are not available, an approximate aberration response could be obtained by using the average of a large collection of cell images (17), which have first been screened to remove outlying images where very bright or dark patches are present, and then smoothing it with a higher order polynomial smoothing function (e.g., five instead of the two that is programmed in CellProfiler) in order to capture the nonuniformities in the aberration function and yet avoid generating the patchy areas observed with a median or Gaussian smoothing function.
Image-based analysis of the cell cycle has been described under screening conditions (3–5) for chemical libraries; however, no image correction was attempted and only a statistical comparison between the intensity distributions with a reference was performed in these studies to address the cell cycle parameters. Although one can use higher magnifications and invoke software auto-focusing to improve the quality of the captured images and hence obtain better data for subsequent analysis, the time and expense incurred would make this impractical in high-throughput screening applications. Furthermore, the proprietary software used in these image captures does not allow for acquiring replacements for images that are out of focus, overcrowded, or have excessive debris. Previous screens for cell cycle genes in mammalian cells (6, 7) have relied only on DNA measurement either with DAPI (with no physical image correction methods described) or laser cytometry (12, 13), where direct measurement of S-phase and cell shape/size parameters was not possible. By correcting for local background fluorescence, we were able to obtain data from which indicative quantitative cell cycle proportions similar to those from flow cytometry, together with other morphologic parameters (which cannot be obtained from flow cytometry), can be extracted to represent the cell population in the well. Our automated cell cycle gating technique is simple and performs this task effectively (attributes that play a major role in processing large sets of data, as in genome-wide screens). These algorithms are also robust to variations in stain intensities from well to well and plate to plate, and to biological differences in DNA intensity distributions [unlike those that rely on the prior alignment of the intensity distribution with a reference (11)]. This eliminates the bias, time, and resources that are inherent in manual segregation of the DNA distributions (7, 8).
In the few instances (0.5%) where the automatic gating failed to locate the correct boundaries, the boundaries and the results generated were so far away from the norm that these cases were easily identified and flagged for manual review. These correction methods are generally applicable to high throughput image-based siRNA screens to enable accurate measurement of multiple phenotypes and to identify candidate siRNAs that showed significant effects.
Although many of the individual elements presented here have been described previously, we have linked these and reduced them to practical screening conditions for siRNA image-based screening. In addition to the image correction methods, some details of reduction to practice in 384-well plates must be controlled during the execution of the screen. For example, avoiding overcrowding of wells remains a key factor in the performance of the overall methods, and careful attention to the pH conditions in storage is required to stabilize signals in the cell cycle assay. Despite these details, we believe the image intensity correction and automated population partitioning will be applicable to other genome-scale functional imaging screens where signal quantification and/or cell cycle analysis is required.