Keywords:

  • confocal microscopy;
  • image analysis;
  • image quantification;
  • subcellular imaging;
  • visualization

Abstract


Fluorescent microscope imaging technologies have developed at a rapid pace in recent years. High-throughput 2D fluorescent imaging platforms are now in wide use and are being applied on a proteome-wide scale. Multiple-fluorophore 3D imaging of live cells is being used to give detailed localization and subcellular structure information. Further, 2D and 3D video microscopy are giving important insights into the dynamics of protein localization and transport. In parallel with these developments, significant research has gone into developing new methodologies for quantifying and extracting meaning from the imaging data. Here we outline and give entry points to the literature on approaches to quantification such as segmentation, tracking, automated classification and data visualization. Particular attention is paid to the distinction between, and application of, concrete quantification measures, such as the number of objects in a cell, and abstract measures, such as texture.

Unlike many imaging methodologies, such as X-ray crystallography, that are intrinsically analytic and mathematical, fluorescent microscopy has been slower to adopt and to develop novel methods of analysis and quantification. This may in part be because the results can immediately be 'seen', and hence quantification may not appear essential. The diversity of users, including cell biologists, cancer researchers, neuroscientists and plant biologists, also means that the literature is scattered across specialist journals and publications unlikely to be read by the cell biologist.

Hence, while there are several widely known quantification methodologies such as co-localization analysis, fluorescence resonance energy transfer (FRET), fluorescence correlation spectroscopy (FCS), fluorescence recovery after photobleaching (FRAP) and fluorescence lifetime imaging microscopy (FLIM), the range of methods applicable to fluorescent imaging of cells is much broader than may be apparent. In response to the recent advances in imaging technologies, new methods are being developed in automated classification, machine learning, image statistics, clustering, visualization, modelling, feature extraction, segmentation and object tracking to firstly deal with the scale of the data becoming available, but more importantly to find new ways to extract the information contained within the data sources and fully exploit their potential.

There is a wide range of reasons to want to quantify fluorescent imaging. One of the most important is the need to remove potential (unconscious) bias in data selection. A typical microscope slide may well contain upwards of 1000 cells, the majority of which will not be examined in detail when observing by eye, for instance, the localization of a protein. As well as selection bias, important data may be missed. Of those 1000 cells, a small proportion might exhibit a distinct localization, or multiple localizations. If only 1–2% of the available data are being sampled, such effects will in all likelihood be missed, and may have been the more interesting result. Similarly, quantification of large numbers of images gives the statistical power to detect subtle effects when comparing experiments. Upon stimulation of a pancreatic cell with sucrose, there might be a 5% drop in the number of insulin granules in the cell as the insulin is released into the extracellular environment; an effect that would be visually undetectable. However, with an automated granule counting assay, hundreds of cells might be quantified under a variety of treatments, and compounds that subtly change this response identified. More broadly, with whole proteome localization imaging now a reality (1), automated quantification and classification are becoming essential to deal with the growth in imaging data and remove the bottleneck of manual inspection. In the longer term, quantification is needed to enable the sorting, comparison and integration of the valuable data contained in the millions of fluorescent images that are now being generated each year. Just as database, search and quantification methodologies have added great value to the sequencing revolution, similar tools for imaging will extend the range of biological conclusions that can be made. Finally, fluorescent imaging is potentially a rich data source for mathematical modelling. With the ability to observe and quantify multiple proteins simultaneously in a live cell context over time and under a range of conditions, there is now the data to begin to model and understand the systems biology of the cell.

The purpose here is to outline the main approaches and progress being made in the analysis of subcellular imaging, to give entry points to the literature, and to identify some of the points at which further research is required. To a large degree, image analysis begins once an image set has been captured, and in the following, a range of analysis options that might be applied to such an image set will be described. However, there is a strong need for analysis options to be considered before the images are acquired. Firstly, as has been observed: 'tweaking microscope settings for 5 min could save months of tweaking algorithms' (2). But more importantly, awareness of the analysis options changes the range of experiments that will be attempted and the conclusions that can be drawn. Further, simply posing the question of 'how could the difference be quantified?' can give invaluable insights into the data.

Spatial Imaging


Abstract and concrete image quantification

Within fluorescent image analysis there are presently two main approaches to quantification measures. The first, and best known, might be called concrete statistics. These include counting measures such as the number of structures in a cell, the volume occupied by a structure or the ratio of fluorescent intensity between regions. At the other end of the spectrum are abstract statistics to measure image content. These are abstract in the sense that they measure properties of an image such as texture or morphology, rather than the more concrete counting measures. One such set of image statistics is the Haralick texture measures (3), the essence of which is to quantify correlations between pixels at a given distance and angular separation.
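As an illustration of how such texture statistics are computed in practice, the following minimal sketch derives a small Haralick-style feature vector using scikit-image's grey-level co-occurrence matrix. The choice of distances, angles and properties is illustrative rather than taken from (3).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def texture_features(image_8bit: np.ndarray) -> np.ndarray:
    """Return a small vector of co-occurrence statistics for an 8-bit image."""
    # Co-occurrence counts for pixel pairs 1 and 2 pixels apart,
    # at 0, 45, 90 and 135 degrees.
    glcm = graycomatrix(image_8bit,
                        distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "correlation", "energy", "homogeneity"]
    # Average each property over distances and angles for a degree of
    # rotation invariance, as is commonly done.
    return np.array([graycoprops(glcm, p).mean() for p in props])
```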

The advantage of concrete measures is that it is immediately apparent what is being measured, and thus it is possible to make statements such as 'there was a 50% reduction in the count under treatment with compound X'. However, the choice of concrete measures is typically based on the expectations of the researcher, and hence unexpected distinctions may be missed. In contrast, abstract measures such as texture make fewer assumptions and tend to be more generic in the range of imaging that can be distinguished. But while abstract statistics may distinguish a wider range of experiments, what the actual difference is can be less clear. In the next section, the generation of concrete statistics and their applications will be outlined, followed by a section on applications of abstract statistics.

Segmentation and quantification

Quantification from fluorescent imaging involves several stages, each of which may influence the results of another. A typical workflow might include sample preparation, image acquisition, image filtering to remove noise (4) or background, region or edge detection, quantification and data analysis (Figure 1). A good overview of many of the issues in each step may be found in (5). Of these steps, segmentation, that is the process of partitioning an image into multiple regions, typically with the aim of identifying objects or boundaries, is one of the more challenging. Once segmented, statistics such as number of objects, object sizes and intensity ratios are typically straightforward to extract.
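As a minimal sketch of such a filter, segment and quantify workflow, the following Python fragment (using scikit-image) smooths, thresholds, labels and measures an image. The file name, Otsu threshold and size cut-off are assumptions that would need tuning for any real assay.

```python
import numpy as np
from skimage import io, filters, measure, morphology

img = io.imread("cells.tif")                         # hypothetical input image
smoothed = filters.median(img)                       # reduce shot noise
mask = smoothed > filters.threshold_otsu(smoothed)   # global intensity threshold
mask = morphology.remove_small_objects(mask, min_size=50)  # drop debris
labels = measure.label(mask)                         # connected components = objects

# Concrete statistics: object count, sizes and mean intensities.
props = measure.regionprops(labels, intensity_image=img)
print("objects:", labels.max())
for p in props:
    print(p.label, p.area, p.mean_intensity)
```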


Figure 1. Typical steps in (concrete) quantification of fluorescent imaging. Before image acquisition commences, it is worth considering the kinds of analyses for which the imaging is required. For instance, if analysis is to be performed on a per-cell basis, ensuring that the cells have low confluency on the plate can save considerable time by enabling automatic selection of individual cells. Similarly, when imaging treated and untreated cells, utilizing identical microscope settings such as exposure time will reduce the chance of detecting differences in imaging conditions rather than true variation between experiments. Once images have been acquired, some filtering can be appropriate, such as background subtraction to remove uneven illumination or a diffuse cytoplasmic signal, or median filtering to reduce noise. However, care should be taken that such filtering does not alter the truth of the experiment. Hence median filtering can be appropriate to enable better object selection, but filtering before taking intensity measurements might not be justifiable. Object detection then often involves testing several edge detection or segmentation methods and adjusting parameters to find those that produce the best results. Once segmentation has occurred, generating statistics on the number, size, spatial distribution and so on is usually straightforward, and the data can then be analysed. As data sets become larger, data visualization techniques are becoming increasingly important. Visual comparison across more than four to five columns of data is difficult, and so it is often beneficial to represent the data utilizing colour, size, spatial and time dimensions to give the eye the opportunity for insight into the data. Finally, analyses will often suggest further avenues for experimentation or the need for more imaging so that statistical significance can be achieved. In creating an analysis cycle, it is recommended to test and adjust the pipeline on a small set of images before commencing a large-scale image capture. Figure 3 gives a similar workflow for abstract image statistics.


While segmentation in general is a developed field, so that for instance many modern digital cameras will identify and automatically correct 'red-eye' in portrait photographs, segmentation of fluorescent imaging of cells is still very much a developing research area. This is in part because of technical difficulties such as the relatively low signal-to-noise ratios of fluorescent imaging and photobleaching. But the highly dynamic nature of subcellular structures, and of protein recruitment to those structures, with radical variations and changes in apparent morphology, also means that segmentation methods based on expectations about the morphology and light characteristics of the objects to be identified are rarely applicable, except for regions such as the nucleus in which the geometry is simple. Hence segmentation of cellular fluorescent imaging is largely based on either intensity threshold methods to select regions or intensity difference methods to find edges.

At the cell level, robust systems have been developed to automatically select individual cells from high-throughput 2D imaging, identify nuclear subregions and quantify proteins of interest within the regions found, in order to distinguish phenotypes (6). At the nuclear level, while improving nuclear selection from 2D imaging is still an active area of research (7), automated nuclei selection has been applied to areas such as cell cycle regulation (8) and to distinguishing proliferating and malignant cells (9). At a finer grained level, considerable research has gone into segmenting and quantifying individual subcellular structures from imaging. Because of their relative structural simplicity and hence their amenability to techniques such as 'Mexican hat filtering' (Figure 2), there has been some success in quantification of punctate structures such as endosomes, peroxisomes and nuclear speckles (10,11). Mexican hat filtering (or Laplacian of Gaussian) is an edge detection method that can be 'tuned' by parameters to detect edges at different scales. Another useful technique to separate objects based on the topology of the image is watershedding (Figure 2). In this, discrete regions are found by 'flooding' from intensity peaks, and regions are only joined if the 'valley' between them is sufficiently shallow. Such techniques are now standard tools in fluorescent image analysis packages such as ImageJ (see Table 2) and CellProfiler (12). For the reasons outlined above, there has been less success in segmenting non-punctate subcellular structures beyond thresholding, edge detection and watershedding schemes, although neurite segmentation is an exception (13). A wide range of methods with references can be found in Table 1 of (14).
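To make the two techniques just named concrete, a minimal sketch in Python (SciPy/scikit-image) is given below; the sigma, response threshold and minimum peak distance are illustrative parameters, not values from the works cited.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def segment_puncta(img: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    # LoG ('Mexican hat') filter: bright puncta of scale ~sigma give strong
    # negative responses, so negate to turn them into peaks.
    log = -ndi.gaussian_laplace(img.astype(float), sigma=sigma)
    mask = log > log.mean() + 2 * log.std()   # crude threshold on the response

    # Watershed from peaks of the distance map to split touching objects.
    distance = ndi.distance_transform_edt(mask)
    peaks = peak_local_max(distance, min_distance=3, labels=mask)
    markers = np.zeros_like(mask, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    return watershed(-distance, markers, mask=mask)
```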


Figure 2. Examples of object selection. A) An image of 4′,6-diamidino-2-phenylindole (DAPI) stained nuclei in HeLa cells (image courtesy Teasdale Group, IMB). One approach to object detection is Laplacian of Gaussian (LoG) edge filtering, the result of which can be seen in (B). Edge detection via LoG first utilizes a Gaussian filter: each pixel in the image is replaced by a weighted average of the intensity of pixels in the local area, the weighting given by a Gaussian distribution centred on the pixel of interest. The effect of the Gaussian filtering is to smooth or blur the image, and different widths of the Gaussian in the LoG filter may be utilized to extract features at different scales in the image. A Laplacian operator is then applied to calculate the second derivative of the image. The second derivative is zero in flat regions of an image, positive on one side of an edge, negative on the other, and zero at some point between. Hence edges may be found by determining the 'zero-crossings' of the LoG filter, giving results such as those shown in (B). In practice, the calculation of the LoG intensity image often occurs in a single step, with a convolution matrix being used to replace each pixel's intensity with a weighted average of pixels in the local area. Interpreting the weights in the convolution matrix as 'heights', the result looks somewhat like a Mexican hat. Image (B) was generated using the LoG filter available as a plug-in to ImageJ. It can be seen in (B) that the LoG filter has found the majority of the edges of the nuclei. Also, the soft glow between the two cells in the lower right corner has correctly not been detected. However, two closely adjacent cells in the centre of the image have been incorrectly segmented, with a single boundary enclosing both. Watershedding may sometimes be used to cut objects that are touching. There are many variants of watershed methods, but here the one implemented in the ImageJ core is briefly described. The first step is to create a binary image in which white regions correspond to objects of interest and black to background. The image in (C) was constructed by first subtracting background using a rolling ball method (implemented in ImageJ) to remove unevenness in the intensity of the background. An intensity threshold was then applied to create the binary image (C). The image might similarly have been produced by taking the results of the LoG filter in (B) and filling the enclosed regions with white. A distance map (D) is then generated from (C). The intensity of each pixel in (D) is proportional to the distance of the corresponding pixel in (C) to a black region. Hence the central regions of the nuclei become intensity peaks. The next step in the algorithm is to 'flood' from the intensity peaks. Distinct regions are incrementally grown from each intensity peak until all non-background regions are covered. Where regions from distinct intensity peaks meet, boundaries are formed (E). It can be seen that the region delineating the two nuclei incorrectly segmented in (B) contains two peaks in (D), and a boundary between the two can be seen in (E).


Table 2. A selection of open source software tools for fluorescent image analysis and storage.

ImageJ | rsbweb.nih.gov/ij/download.html | Image analysis and quantification with many plug-ins
Cell Profiler | www.cellprofiler.org/download.htm | High-throughput image quantification and analysis (6)
CPAnalyst | www.cellprofiler.org/downloadCPA.htm | Image analysis including machine classification (30)
OME | www.openmicroscopy.org/site/downloads | Visualizing, analysing and managing microscope data (12)
BISQUE | www.bioimage.ucsb.edu/downloads | Bio-image database and analysis
CellID | www.molsci.org/protocols/software.html | Cell finding, tracking and analysis (48)
Murphy Lab | murphylab.web.cmu.edu/software/ | Automated image classification and applications (25)
ImageSurfer | 152.19.37.82/main/ | Multichannel volume visualization and analysis (18)
iCluster | icluster.imb.uq.edu.au/ | Visual and statistical image differentiation (31)

Table 1. A selection of commercial software tools for fluorescent image analysis and storage.

MetaMorph | www.moleculardevices.com
Imaris | www.bitplane.com
Volocity | www.improvision.com
Amira | www.amiravis.com
LSM Image Browser | www.zeiss.com

Each of the image analysis tools supports a wide range of applications such as segmentation, intensity quantification, tracking and co-localization for multidimensional fluorescent imaging, as well as specialized applications such as cell migration analysis, FRAP analysis and volume rendering for visualization.

Imaging in two dimensions can be problematic for segmentation, as objects that apparently overlap may be spatially separated in the third dimension. Hence 3D fluorescent imaging offers both a more detailed view of subcellular structures and greater amenability to segmentation and quantification, and is a developing area of segmentation research. Using segmentation techniques such as gradient flows and coupled active surfaces, nuclei may readily be segmented and quantified from 3D fluorescent imaging (15,16). Similarly, tools exist to count and quantify punctate structures in 3D imaging via watershedding techniques (17). Further examples and tools for segmentation and visualization of 3D fluorescent imaging may be found in (17,18) and references therein.

At present there is no universal solution to segmentation of fluorescent imaging. For the microscopist, the usual approach is to experiment in software such as ImageJ that supports a range of methods. If simple approaches such as thresholding fail because of background intensity variation or highly clustered objects, then edge detection or watershedding methods might be tried. If these fail, a literature search may turn up software methods that have been specifically designed for the imaging of interest. In some cases, small changes in experimental protocol or image capture settings may improve segmentation results. In this way, fluorescent image segmentation is still an experimental science involving an iterative process of testing and alteration of computational and experimental methods.

Classification and testing for difference

In understanding the functions of the tens of thousands of proteins being found by the sequencing revolution, the most fundamental question is: what does the protein do? The first steps towards this are: where is the protein in a cell, and what does it interact with? Towards answering these, modern automated fluorescent microscopy offers an enormous depth and coverage of information: depth, in that a single well may contain over a thousand cells that can be imaged in a few tens of seconds; and coverage, in that whole proteomes may now be imaged. However, the number of images so obtained is overwhelming. In 2003, some 75% of the yeast proteome (4156 proteins) was screened and manually classified into 22 localizations (1). Further, it has been estimated that a complete human genome RNAi screen could be imaged in approximately 2 weeks, but would give rise to 10⁶ images (19).

As a consequence of the wide range of phenotypes, concrete image statistics are not well suited to general problems of distinguishing subcellular imaging. Hence considerable effort has gone into abstract measures of fluorescent imaging. Conrad et al. (20) tested 448 different image features for their ability to distinguish images of subcellular localization and found that texture measures had the best performance in distinguishing a range of phenotypic imaging; these form the foundation of the majority of current automated image classification systems. A common approach is via a statistical classifier such as a neural network (21) or support vector machine (22). Initially, a classifier is trained on the statistics of images of known (human-classified) localization, and it is then used to classify images of unknown localization. Several groups (20,23), including my own (24) (Figure 3), have taken this approach and have shown that correct classification rates of up to 98% (24) can be obtained on images of the major subcellular localizations. Further, automated classification results have surpassed human accuracy (25) and have been applied to yeast proteome imaging (26). Similar approaches have been applied to 3D whole cell imaging with comparable results (25), and more specialized classifiers have also been created to identify cell phase (27), mitotic patterns (28) and F-actin ruffles (29). Recently, facilities have been incorporated into the Cell Profiler Analyst software to interactively classify examples to train a machine learning algorithm that will then classify new examples (30).
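The training step might look like the following minimal sketch using scikit-learn's support vector machine. The feature matrix and labels here are random placeholders standing in for statistics vectors of human-classified images, and the kernel and regularization settings are illustrative, not those of the cited systems.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# X: (n_images, n_features) statistics vectors; y: localization labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))        # placeholder feature vectors
y = rng.integers(0, 10, size=200)     # placeholder labels (10 localizations)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

clf.fit(X, y)                         # train on the 'known' images
unknown = rng.normal(size=(1, 30))    # statistics of an unclassified image
print("predicted localization:", clf.predict(unknown))
```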


Figure 3. Applications of image statistics to 2D subcellular localization image analysis. A high-throughput 2D fluorescent image set is acquired, possibly with a nuclear image. Depending upon the application, individual cells may be selected and cropped. Image statistics that measure texture and morphology, such as Haralick texture measures or threshold adjacency statistics, are then generated for each image. A vector of real numbers is thus associated with each image, and these vectors have a number of applications. Machine learning techniques such as support vector machines may be trained on images of known subcellular localization and then used to classify images of unknown localization (24). Images from treated/untreated experiments may be compared and p-values calculated for the null hypothesis of no change (31). The vectors may also be clustered and/or visualized to find the principal patterns of expression in an experiment and to detect outlier images or clusters of images (32). Statistics vectors also give a measure of distance between images, which can then be used to rank the match of an image to any others, and hence allow matching by content. Finally, for a given experiment, an image to represent that experiment may be chosen in an unbiased way by selecting the image whose statistics are closest to the centroid of the statistics vectors for that experiment. Images courtesy Rohan D. Teasdale (IMB).
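The representative-image selection mentioned at the end of the caption reduces to a few lines; a minimal sketch, assuming a precomputed statistics matrix, follows.

```python
import numpy as np

def representative_index(X: np.ndarray) -> int:
    """X: (n_images, n_features) statistics vectors for one experiment.
    Returns the index of the image closest to the centroid."""
    centroid = X.mean(axis=0)
    return int(np.argmin(np.linalg.norm(X - centroid, axis=1)))
```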


One difficulty with automated classification is that organelle structure can vary widely between cell types, and thus classifiers usually need to be retrained for each cell type, although research is ongoing into removing this limitation (33). Another difficulty is that subcellular localization classes, and representative training images for each, need to be chosen before training. With protein localization often being a highly dynamic process, with a protein exhibiting multiple localizations or localizing to subdomains at different or the same points in time, localization is not necessarily clearly defined. Hence assigning the designation 'endosomal' may be technically correct, but does not fully describe the situation. Thus, automated classification is to some extent fitting an image into a predefined box that may not reflect the true diversity of a protein's expression.

To better provide a view of the diversity of protein expression, attention is beginning to focus on clustering imaging using the statistical measures developed for classification. Here the aim is to find and group the principal patterns of expression in imaging for one or more proteins, in much the same way that sequence analysis and measures of sequence similarity may be used to define families of proteins. In (34), imaging of 188 clones of randomly tagged proteins in NIH 3T3 cells was found to group into 35 statistically significant clusters or location patterns using k-means clustering on their image statistics vectors. On the genome-wide scale, new patterns or families of proteins may in this way be found that are not dependent on choosing localization 'boxes' (33).
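A minimal sketch of this clustering step, in the spirit of (34), follows. The input file is hypothetical, and the cluster count is simply fixed here, whereas (34) determined the number of statistically significant clusters from the data.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.load("image_statistics.npy")   # hypothetical (n_images, n_features) array

# Group the statistics vectors into candidate expression patterns.
km = KMeans(n_clusters=35, n_init=10, random_state=0).fit(X)
for k in range(km.n_clusters):
    members = np.flatnonzero(km.labels_ == k)
    print(f"cluster {k}: {len(members)} images")
```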

A related question to identifying localization is detecting when localization has changed. A typical experiment would be to image a protein with and without co-expression of another protein to understand how they interact (35), or to image a protein or proteins under a range of drug treatments to screen for active compounds (36,37). In such cases it is not so important what the actual localization of the protein is so much as whether it has been perturbed by an introduced interaction. Image statistics may be used to measure how 'separated' the statistics of two experiments are. One approach is to examine the (statistical) neighbours of each image to determine whether they are of the same class (38). By employing permutation testing, a p-value for the null hypothesis of no difference between experiments may then be generated. Similarly, in my own research, the distance between the mean statistics vectors of two experiments gave a measure of how separated the experiments were, and permutation testing could then be employed to assign p-values for how unlikely that separation was under the null hypothesis (31). With this approach it was possible to differentiate 10 distinct localizations in HeLa cells and detect relatively subtle changes such as endosomal redistribution.
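A minimal sketch of this style of permutation test follows; it implements the general idea (distance between mean statistics vectors, significance by random relabelling) rather than the exact procedures of (31) or (38).

```python
import numpy as np

def permutation_pvalue(A: np.ndarray, B: np.ndarray,
                       n_perm: int = 10000, seed: int = 0) -> float:
    """A, B: (n_images, n_features) statistics vectors for two experiments."""
    rng = np.random.default_rng(seed)
    observed = np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
    pooled = np.vstack([A, B])
    count = 0
    for _ in range(n_perm):
        # Randomly relabel images and recompute the separation.
        perm = rng.permutation(len(pooled))
        pa, pb = pooled[perm[:len(A)]], pooled[perm[len(A):]]
        if np.linalg.norm(pa.mean(axis=0) - pb.mean(axis=0)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)   # add-one correction for finite sampling
```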

Imaging in Time


Live cell fluorescent video microscopy offers a wealth of information on the dynamic organization of proteins and subcellular structures that is unavailable in static 2D and 3D imaging. With the addition of time, organelle dynamics as proteins are recruited, transported and expelled can be viewed in detail, and the passage through a cell of proteins and of the structures with which they interact can be readily observed. However, while visual comparison of spatial structures for differences such as size and morphology may easily be made if the differences are large enough, comparisons in time are more difficult, and hence quantification is essential to detect anything but the coarsest features of the image data.

Tracking

As with segmentation, object tracking from fluorescent video microscopy presents many challenges. Objects viewed may join, split, disappear, change direction or substantially change their morphology, and there are technical challenges such as photobleaching and compromises between spatial and temporal resolution. Typically, higher spatial resolution leads to better identification of the objects to be tracked, but reduces the time resolution and hence the ability to decide which object corresponds to which at distinct time-points. Further, depending on the markers used, the subcellular environment can appear complex and cluttered. Hence tracking algorithms developed in other research areas and adapted to fluorescent video microscopy tend to perform poorly (14), and considerable research has gone into designing algorithms specific to fluorescent imaging. Typical steps taken in object tracking are image acquisition, image filtering to enhance object detection, segmentation or object detection, and finally matching of objects at different time-points to create paths. One advantage that tracking has over other image quantification problems is that in most cases image filtering need only preserve the position of the detected object and not necessarily its structure. An excellent review of the approaches taken is given in (14).

The art of tracking is in the matching of objects between images to create paths. At its simplest, an object is matched to the object that it is closest to in the successive image, within a given radius (the expected maximum distance an object can move between frames). Variations allow objects to appear or disappear temporarily or permanently, or state the problem as a global optimization problem, for instance to minimize the total path lengths of objects. However, such an approach is likely to fail in environments in which the typical distances between objects are of the order of the distance an object may move between time-points. Technologies such as quantum dots (39) attempt to avoid this by introducing a few fluorescently bright dots to track. Improved tracking can be achieved by incorporating assumptions about the tracked object, such as maximum changes in velocity, morphology or size. With such models, objects can often be tracked in surprisingly complicated environments. For instance, in (40) complex networks of microtubules could be tracked, firstly by filtering to enhance lines and then by utilizing the fact that the tips of microtubules either grow or shorten. From such tracking, detailed statistics of microtubule behaviour could then be obtained.
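The simplest matching rule described at the start of this paragraph can be sketched as follows. This greedy nearest-neighbour linker is deliberately naive; real trackers add the appearance/disappearance handling and global optimization discussed above (14).

```python
import numpy as np

def link_frames(prev: np.ndarray, curr: np.ndarray, max_dist: float):
    """prev, curr: (n, 2) object centroids in consecutive frames.
    Returns a list of (i, j) links; unmatched objects are simply dropped."""
    links, taken = [], set()
    for i, p in enumerate(prev):
        d = np.linalg.norm(curr - p, axis=1)
        d[list(taken)] = np.inf                   # each current object used once
        j = int(np.argmin(d)) if len(d) else -1
        if j >= 0 and d[j] <= max_dist:           # within the expected radius
            links.append((i, j))
            taken.add(j)
    return links
```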

Another approach is to track features (without segmentation) rather than objects. In SpotTracker (41), a particle is tracked in complex environments by considering all possible paths and optimizing a cost function involving path smoothness, distance and passage through bright pixels. Thus the particle is not segmented from the image; rather, the algorithm tracks a bright feature within certain constraints. This enabled telomeres to be accurately tracked despite potential confusion with the nuclear envelope that also appeared in the imaging. Combinations of segmentation and features have also been successfully applied to automate lineage tracking up to the 350 cell stage in Caenorhabditis elegans (42).

Possibly the most ambitious tracking to date was that created to investigate the dynamics of promyelocytic leukemia nuclear bodies (PML NBs) in mitosis (43). In this work, human osteosarcoma cells (U2 OS) were imaged in 3D over time with a variety of marker proteins. Nuclei at distinct time-points were registered with each other by applying appropriate rotations and translations, the results segmented, and the PML NBs then tracked in each nucleus. This gave very detailed information on changes in the dynamics of PML NBs at stages of mitosis and their associations with mitotic proteins.

Quantifying over time

While tracking and counting objects over time gives invaluable information about the movement and dynamics of cells and subcellular structures, intensity information within tracked objects can also be exploited. Two examples are given here, one at the cellular and one at the subcellular level.

At the cell and multicellular level, automated tracking has been combined with automated classification to elucidate the phases and timing of mitosis (28). Multicell 3D image sequences of the chromosomal marker histone 2B-enhanced green fluorescent protein (EGFP) were generated over time. These were automatically segmented into individual nuclei and tracked, with mitotic events identified as points at which new tracks were initiated. Each nucleus at each time-point was also classified into one of seven cell cycle classes utilizing automated texture-based classification techniques similar to those described earlier. This enabled automated analysis of the duration of the phases of the cell life cycle in high throughput, and has the potential to be applied to high-throughput RNAi screens to explore the coordination of mitotic processes.

At the subcellular level, in my group's collaborations, 2D and 3D video microscopy have been used to study the role of 3-phosphoinositides in macropinocytosis (44). A typical experiment involves two fluorescent markers: dextran to fill and delineate the region of the macropinosome, and a marker such as GFP-2xFYVE to track phosphatidylinositol-3-phosphate (PI(3)P). The dextran channel is used to create a mask of the macropinosome and track its movements, and within this mask the average intensity in the PI(3)P channel can be calculated. In this way the rate of recruitment, the time of retention and the rate of expulsion of phosphoinositides from the macropinosome could be automatically obtained. Combinations of phosphoinositide markers could be used to show quantitatively the order and timings of recruitments to and expulsions from the macropinosomes.
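A minimal sketch of this two-channel masking idea follows. Otsu thresholding of the dextran channel stands in here for whatever segmentation is appropriate, and the array shapes are assumptions.

```python
import numpy as np
from skimage.filters import threshold_otsu

def recruitment_curve(dextran: np.ndarray, reporter: np.ndarray) -> np.ndarray:
    """dextran, reporter: (T, Y, X) movies of the two channels.
    Returns mean reporter intensity inside the dextran mask per frame."""
    means = []
    for dex, rep in zip(dextran, reporter):
        mask = dex > threshold_otsu(dex)      # delineate the macropinosome
        means.append(rep[mask].mean() if mask.any() else np.nan)
    return np.array(means)
```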

Visual Data Representation


As high-throughput imaging and analysis becomes more commonplace, there is a need to develop a language of data representation and visualization to make sense of and convey the meaning of the multidimensional data. New forms of data require new forms of representation. As noted by Edward Tufte, a pioneer in the field of data visualization, “At their best, graphics are instruments for reasoning about quantitative information”(45).

With many fields having utilized 3D and 4D imaging, numerous tools exist for surface and volume rendering (18), but techniques need to be adapted to visualizing the information of interest to the fluorescent microscopist. In the dense environment of the cell, the relative motility of even segmented and rendered objects can be difficult to ascertain when viewed as a movie. One approach to overcoming this is to use time as a spatial dimension. In (46), vesicles were segmented and tracked from 3D subcellular movies, and the dimensionality reduced by z-projection, giving a 2D image for each time-point. These were then visualized in 3D with the third spatial dimension being the time-point from the movie. Hence stable vesicles appeared as long straight cylinders, while more motile vesicles showed greater curvature in the time dimension, enabling a fast visual assessment of the motility state of a large number of vesicles. In my group's work we have used similar techniques to visualize the growth and retraction of tubules from vesicles during endocytosis (47). The advantage of transforming the time dimension into a spatial one is that all of the data can be viewed and compared, and the relationships between objects and the timing of events seen at a glance.
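The transformation itself is simple; a minimal sketch, assuming a 4D movie array, follows. Stable objects become straight columns in the resulting volume, while motile ones curve.

```python
import numpy as np

def time_volume(movie: np.ndarray) -> np.ndarray:
    """movie: (T, Z, Y, X) image sequence -> (Y, X, T) volume for rendering."""
    projections = movie.max(axis=1)           # maximum-intensity z-projection
    return np.moveaxis(projections, 0, -1)    # put time last, as a 'spatial' axis
```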

In another visualization technique borrowed from phylogeny, statistics generated to quantify imaging have been used to define distances between images, and hence to generate 'phylogenetic' trees for imaging. In (9), this approach was used to cluster and create similarity trees for confocal images of breast epithelial cells, and in (34) a consensus subcellular localization tree was created for imaging from 126 wells of randomly tagged 3T3 cells. In this way it was possible to see the relationships between images, and the hierarchical structure also naturally created classes such as 'punctate' as unions of several localization classes. Along similar lines, my group is interested in comparing and reviewing high-throughput imaging. Towards this, the iCluster high-throughput subcellular localization imaging visualization and clustering tool was developed (Figure 4) (31,32). In the software, large image sets from single or multiple experiments may be loaded, statistics generated for each image, and the statistics vectors then mapped into two or three dimensions in such a way as to preserve the distances between them. In this way images that are statistically similar are spatially close, and dissimilar images are distant, allowing the full range of patterns of expression of the experiment(s) to be readily observed. Outliers and unusual cells are then easily detected, and differences between treated and untreated experiments can be seen as spatial separation.
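A minimal sketch of the embedding step follows. iCluster uses Sammon mapping; scikit-learn's metric multidimensional scaling, a close relative that likewise aims to preserve pairwise distances, stands in here, and the input file is hypothetical.

```python
import numpy as np
from sklearn.manifold import MDS

X = np.load("image_statistics.npy")   # hypothetical (n_images, n_features) array

# Map the statistics vectors into 3D, approximately preserving distances.
coords = MDS(n_components=3, random_state=0).fit_transform(X)

# 'coords' gives each image a 3D position at which its thumbnail can be
# drawn, so that statistically similar images appear spatially close.
```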


Figure 4. Visualizing to make sense of high-throughput imaging. One thousand and four hundred images of 10 subcellular localizations spatially arranged by iCluster are shown (31,32). For each image, threshold adjacency statistics and Haralick texture measures are generated to associate a vector with that image. Vectors are Sammon mapped into three dimensions such that distances between vectors are preserved to a high degree. Images are then visualized at the coordinates so found. Each border colour represents a different subcellular localization. It can be seen that images of the same localization are largely clustered together, hence outliers and distinct patterns of localization are visually readily detectable. iCluster also provides facilities for representative image selection, statistical testing for difference between image sets and image reclassification. While developed for subcellular imaging, iCluster can be applied to any objects for which there are high-dimensional data that need to be visualized in low dimensions to observe the relationships. Cell images courtesy Rohan D. Teasdale (IMB).


Software Tools


In any rapidly developing field requiring computational support, software to implement new methodologies is inevitably a significant problem. Several commercial solutions for analysis of fluorescent imaging are now available and provide a wide range of functionality (Table 1). However, the rise of several large-scale open source projects such as ImageJ, Cell Profiler and the Open Microscopy Environment (Table 2) is now beginning to provide a powerful alternative. ImageJ provides functions for commonly performed image analysis tasks such as thresholding, particle detection, watershedding, region selection and intensity quantification; Cell Profiler (6) is principally for high-throughput image quantification and automated analysis; and the Open Microscopy Environment (12) supports data management for light microscopy and has components for storing, visualizing, managing and annotating microscopic images and metadata.

The importance of these open source projects is that they provide a high-quality common foundation and environment for sharing methods and tools in bio-image analysis that can be built upon and verified. Each provides a plug-in architecture or application programming interface which enables researchers to easily contribute new methods as they are developed and to exploit a core library of functions and plug-ins previously contributed. For instance, ImageJ has some 500 contributed plug-ins and 300 macros. Macro creation and recording facilities allow even the novice user to establish and distribute analysis pipelines using combinations of functions.

Another significant advantage in the open source analysis and storage projects is in interoperability. Hence ImageJ, Cell Profiler and Open Microscopy Environment are ‘aware’ of each other, and data may be readily moved from one to another to exploit the specific features of each. Similarly, the open programming interfaces mean that as other analysis or storage solutions are developed, these may be easily integrated with those already developed.

Just as researchers build on, transmit and verify knowledge through publication, it is essential that analysis methods and software can be built upon, transmitted and verified through common open source projects such as have been described here.

Conclusion


Bio-image analysis is proving a powerful tool, both in dealing with the volume of imaging arising from fluorescent microscopy and in maximizing the information extracted from the data sources. Much quantification of fluorescent imaging is currently used to distinguish experiments: does the number of endosomes change under treatment with a compound? Does treatment change the velocity profile of the actin comet? From such data, interactions are inferred. The difficulty with this approach is that each protein or subcellular molecule is typically involved in a complex web of interaction networks, and so determining the nature of the effect on the interaction network that gave rise to the observed change can be problematic. But as quantification methods become more commonplace and sophisticated, the next step is to use the quantified data in combination with mathematical modelling so that the interaction networks may be teased apart. A beautiful example of this is given in the fluorescent imaging and mathematical modelling of oscillations in nuclear factor κB (NF-κB) localization between nucleus and cytoplasm (49,50). Combining detailed fluorescent imaging data and modelling, it was shown that the NF-κB system is oscillatory and uses delayed negative feedback to direct nuclear-to-cytoplasmic cycling of transcription factors. In my group's research, the simple geometric information available from live cell video microscopy proved an extraordinarily rich source of information from which to build mathematical models and infer biologically relevant data (Figure 5). On the whole cell scale, research is beginning into generative models of subcellular localization: the range of morphologies and spatial distributions of structures such as the plasma membrane, nucleus and lysosomes is first quantified, and the statistics of the distributions are then utilized to generate synthetic models of cells (51). Once high-resolution spatial and temporal maps of cellular distribution combining multiple proteins have been created, this will provide a foundation from which to model and understand the systems biology of the cell.


Figure 5. Dynamic geometric modelling from video microscopy. (Top) A single frame from a video microscopy movie consisting of 90 images of tubule formation during endocytosis (47). The dark circle (white arrow) is a vesicle that has just formed at the cell's surface. Long tubular extensions form from the vesicle surface and often exhibit multiple branches. Image courtesy Markus Kerr. (Bottom) A dynamic membrane conservation model of the system was developed (52), in which membrane from the vesicle surface is extruded into the tubules, hence reducing the surface area of the vesicle. A frame from a visualization is shown. Experiments with nocodazole treated cells (nocodazole blocks tubule formation) had shown vesicular structures to be relatively static in size. Hence a reasonable assumption was that no membrane was entering or leaving the vesicle system except via the tubular extensions; in other words, membrane was conserved. Note that the radii of the tubular extensions were measured in separate micrograph imaging as 18.36 ± 4.42 nm. This is below the resolution limit of the fluorescent imaging and hence was used in place of measurement from the fluorescent imaging. Despite the simplicity of the model, it led to a number of quantitative predictions about the system, such as an eightfold concentration of the contents of the vesicle and a pH change of 0.9 over the course of the observations. Modelling the surface of the vesicle as having subdomains of membrane available to form tubules led to an understanding of the rate of decrease of the vesicle radius, as well as qualitative features such as the longer branching tubules observed earlier and the short non-branching tubules later in the experiment.
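To make the conservation argument in the caption concrete, the following sketch computes the vesicle radius and concentration factor as membrane area is extruded into cylindrical tubules. The initial radius and tubule lengths are illustrative assumptions; only the tubule radius is taken from the measurements quoted above, and this is a simplified reading of the model in (52), not its implementation.

```python
import numpy as np

a = 18.36e-3    # tubule radius in micrometres (18.36 nm, from the caption)
r0 = 0.5        # assumed initial vesicle radius in micrometres

def vesicle_radius(total_tubule_length_um: float) -> float:
    # Membrane area conserved: 4*pi*r^2 + 2*pi*a*L = 4*pi*r0^2
    return np.sqrt(r0 ** 2 - a * total_tubule_length_um / 2.0)

for L in [0.0, 10.0, 20.0]:
    r = vesicle_radius(L)
    # Contents concentrate as volume shrinks: factor (r0/r)^3.
    print(f"L={L:5.1f} um  r={r:.3f} um  concentration x{(r0 / r) ** 3:.1f}")
```

With these illustrative numbers, roughly 20 µm of total tubule length halves the vesicle radius and so concentrates its contents nearly eightfold, matching the order of magnitude quoted in the caption.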


Acknowledgments


The author would like to thank Dr Rohan D. Teasdale (UQ) and Dr Markus C. Kerr (UQ) for their help in the editing and preparation of this manuscript.

References

1. Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK. Global analysis of protein localization in budding yeast. Nature 2003;425:686-691.
2. Auer M, Peng H, Singh A. Development of multiscale biological image data analysis: review of 2006 International Workshop on Multiscale Biological Imaging, Data Mining and Informatics, Santa Barbara, USA (BII06). BMC Cell Biol 2007;8(Suppl. 1):S1.
3. Haralick RM. Statistical and structural approaches to texture. Proc IEEE 1979;67:768-804.
4. Broser PJ, Schulte R, Lang S, Roth A, Helmchen F, Waters J, Sakmann B, Wittum G. Nonlinear anisotropic diffusion filtering of three-dimensional image data from two-photon microscopy. J Biomed Opt 2004;9:1253-1264.
5. Ronneberger O, Baddeley D, Scheipl F, Verveer PJ, Burkhardt H, Cremer C, Fahrmeir L, Cremer T, Joffe B. Spatial quantitative analysis of fluorescently labeled nuclear structures: problems, methods, pitfalls. Chromosome Res 2008;16:523-562.
6. Carpenter A, Jones T, Lamprecht M, Clarke C, Kang I, Friman O, Guertin D, Chang J, Lindquist R, Moffat J, Golland P, Sabatini D. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol 2006;7:R100.
7. Gudla PR, Nandy K, Collins J, Meaburn KJ, Misteli T, Lockett SJ. A high-throughput system for segmenting nuclei using multiscale techniques. Cytometry A 2008;73:451-466.
8. Stacey DW, Hitomi M. Cell cycle studies based upon quantitative image analysis. Cytometry A 2008;73:270-278.
9. Long FH, Peng HC, Sudar D, Lelievre SA, Knowles DW. Phenotype clustering of breast epithelial cells in confocal images based on nuclear protein distribution analysis. BMC Cell Biol 2007;8(Suppl. 1):S3.
10. Pham TD, Crane DI, Tran TH, Nguyen TH. Extraction of fluorescent cell puncta by adaptive fuzzy segmentation. Bioinformatics 2004;20:2189-2196.
11. Niemisto A, Selinummi J, Saleem R, Shmulevich I, Aitchison J, Yli-Harja O. Extraction of the number of peroxisomes in yeast cells by automated image analysis. Conf Proc IEEE Eng Med Biol Soc 2006;1:2353-2356.
12. Schiffmann DA, Dikovskaya D, Appleton PL, Newton IP, Creager DA, Allan C, Nathke IS, Goldberg IG. Open microscopy environment and findspots: integrating image informatics with quantitative multidimensional image analysis. Biotechniques 2006;41:199-208.
13. Abdul-Karim MA, Roysam B, Dowell-Mesfin NM, Jeromin A, Yuksel M, Kalyanaraman S. Automatic selection of parameters for vessel/neurite segmentation algorithms. IEEE Trans Image Process 2005;14:1338-1350.
14. Kalaidzidis Y. Multiple objects tracking in fluorescence microscopy. J Math Biol 2009;58:57-80.
15. Li G, Liu T, Tarokh A, Nie J, Guo L, Mara A, Holley S, Wong S. 3D cell nuclei segmentation based on gradient flow tracking. BMC Cell Biol 2007;8:40.
16. Dufour A, Shinin V, Tajbakhsh S, Guillen-Aghion N, Olivo-Marin JC, Zimmer C. Segmenting and tracking fluorescent cells in dynamic 3-D microscopy with coupled active surfaces. IEEE Trans Image Process 2005;14:1396-1410.
17. Gniadek TJ, Warren G. WatershedCounting3D: a new method for segmenting and counting punctate structures from confocal image data. Traffic 2007;8:339-346.
18. Feng D, Marshburn D, Jen D, Weinberg RJ, Taylor RM II, Burette A. Stepping into the third dimension. J Neurosci 2007;27:12757-12760.
19. Wollman R, Stuurman N. High throughput microscopy: from raw images to discoveries. J Cell Sci 2007;120:3715-3722.
20. Conrad C, Erfle H, Warnat P, Daigle N, Lorch T, Ellenberg J, Pepperkok R, Eils R. Automatic identification of subcellular phenotypes on human cell arrays. Genome Res 2004;14:1130-1136.
21. Bishop CM. Neural Networks for Pattern Recognition. Oxford: Oxford University Press; 1995.
22. Cortes C, Vapnik V. Support vector networks. Mach Learn 1995;20:273-297.
23. Boland MV, Markey MK, Murphy RF. Automated recognition of patterns characteristic of subcellular structures in fluorescence microscopy images. Cytometry 1998;33:366-375.
24. Hamilton N, Pantelic R, Hanson K, Karunaratne S, Teasdale RD. Fast automated cell phenotype image classification. BMC Bioinformatics 2007;8:113.
25. Huang K, Murphy RF. From quantitative microscopy to automated image understanding. J Biomed Opt 2004;9:893-912.
26. Chen S-C, Zhao T, Gordon GJ, Murphy RF. Automated image analysis of protein localization in budding yeast. Bioinformatics 2007;23:i66-i71.
27. Wang M, Zhou X, Li F, Huckins J, King RW, Wong ST. Novel cell segmentation and online SVM for cell cycle phase identification in automated microscopy. Bioinformatics 2008;24:94-101.
28. Harder N, Eils R, Rohr K. Automated classification of mitotic phenotypes of human cells using fluorescent proteins. In: Sullivan KF, editor. Methods in Cell Biology. New York: Academic Press; 2008, pp. 539-554.
29. Yi Q, Coppolino MG. Automated classification and quantification of F-actin-containing ruffles in confocal micrographs. Biotechniques 2006;40:745-746, 748, 750 passim.
30. Jones TR, Carpenter AE, Lamprecht MR, Moffat J, Silver SJ, Grenier JK, Castoreno AB, Eggert US, Root DE, Golland P, Sabatini DM. Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proc Natl Acad Sci U S A 2009;106:1826-1831.
31. Hamilton N, Wang J, Kerr MC, Teasdale RD. Statistical and visual differentiation of high throughput subcellular imaging. BMC Bioinformatics 2009;10:94.
32. Hamilton N, Teasdale RD. Visualizing and clustering high throughput sub-cellular localization imaging. BMC Bioinformatics 2008;9:81.
33. Newberg J, Murphy R. A framework for the automated analysis of subcellular patterns in human protein atlas images. J Proteome Res 2008;7:2300-2308.
34. García Osuna E, Hua J, Bateman N, Zhao T, Berget P, Murphy R. Large-scale automated analysis of location patterns in randomly tagged 3T3 cells. Ann Biomed Eng 2007;35:1081-1087.
35. Fink JL, Karunaratne S, Mittel A, Gardiner D, Hamilton N, Teasdale RD. Towards defining the nuclear proteome. Genome Biol 2008;9:R15.
36. Cohen AA, Geva-Zatorsky N, Eden E, Frenkel-Morgenstern M, Issaeva I, Sigal A, Milo R, Cohen-Saidon C, Liron Y, Kam Z, Cohen L, Danon T, Perzov N, Alon U. Dynamic proteomics of individual cancer cells in response to a drug. Science 2008;322:1511-1516.
37. Lang P, Yeow K, Nichols A, Scheer A. Cellular imaging in drug discovery. Nat Rev Drug Discov 2006;5:343-356.
38. Zhao T, Soto S, Murphy R. Improved comparison of protein subcellular location patterns. In: 3rd IEEE International Symposium on Biomedical Imaging: Nano to Macro. Arlington, VA: IEEE; 2006.
39. Bonneau S, Dahan M, Cohen LD. Single quantum dot tracking based on perceptual grouping using minimal paths in a spatiotemporal volume. IEEE Trans Image Process 2005;14:1384-1395.
40. Altinok A, Kiris E, Peck AJ, Feinstein SC, Wilson L, Manjunath BS, Rose K. Model based dynamics analysis in live cell microtubule images. BMC Cell Biol 2007;8(Suppl. 1):S4.
41. Sage D, Neumann FR, Hediger F, Gasser SM, Unser M. Automatic tracking of individual fluorescence particles: application to the study of chromosome dynamics. IEEE Trans Image Process 2005;14:1372-1383.
42. Bao Z, Murray JI, Boyle T, Ooi SL, Sandel MJ, Waterston RH. Automated cell lineage tracing in Caenorhabditis elegans. Proc Natl Acad Sci U S A 2006;103:2707-2712.
43. Chen Y-CM, Kappel C, Beaudouin J, Eils R, Spector DL. Live cell dynamics of promyelocytic leukemia nuclear bodies upon entry into and exit from mitosis. Mol Biol Cell 2008;19:3147-3162.
44. Kerr MC, Wang JTH, Hamilton N, Jeanes A, Yap AS, Meunier FA, Brown N, Stow JL, Teasdale RD. 3-Phosphoinositides have sequential and discrete roles during macropinocytosis. 2008; (in press).
45. Tufte ER. The Visual Display of Quantitative Information. 2nd edn. Cheshire, CT: Graphics Press; 2001.
46. Racine V, Sachse M, Salamero J, Fraisier V, Trubuil A, Sibarita JB. Visualization and quantification of vesicle trafficking on a three-dimensional cytoskeleton network in living cells. J Microsc 2007;225:214-228.
47. Kerr MC, Lindsay MR, Luetterforst R, Hamilton N, Simpson F, Parton RG, Gleeson PA, Teasdale RD. Visualisation of macropinosome maturation by the recruitment of sorting nexins. J Cell Sci 2006;119:3967-3980.
48. Gordon A, Colman-Lerner A, Chin TE, Benjamin KR, Yu RC, Brent R. Single-cell quantification of molecules and rates using open-source microscope-based cytometry. Nat Methods 2007;4:175-181.
49. Mullassery D, Horton CA, Wood CD. Single live-cell imaging for systems biology. Essays Biochem 2008;45:121-134.
50. Nelson DE, Ihekwaba AE, Elliott M, Johnson JR, Gibney CA, Foreman BE, Nelson G, See V, Horton CA, Spiller DG, Edwards SW, McDowell HP, Unitt JF, Sullivan E, Grimley R, et al. Oscillations in NF-kappaB signaling control the dynamics of gene expression. Science 2004;306:704-708.
51. Zhao T, Murphy RF. Automated learning of generative models for subcellular location: building blocks for systems biology. Cytometry A 2007;71A:978-990.
52. Hamilton N, Kerr MC, Burrage K, Teasdale RD. Analysing real-time video microscopy: the dynamics and geometry of vesicles and tubules in endocytosis. In: Morgan K, editor. Current Protocols in Cell Biology. New York: Wiley Interscience; 2007.