Advances in genomics research, combinatorial chemistry, and laboratory automation have made it possible to rapidly screen large compound collections (Croston,2002), but these efforts have not resulted in increased research productivity (Lawrence,2007). An increasingly popular sentiment is that better models are needed to improve the discovery of new drug candidates, and it has been proposed that whole organisms could provide such models (Lansbury,2004; Kamb,2005; Sams-Dodd,2005). Model organisms have long served as a tool for drug discovery, but whole organism screens have not yet found wide use in high-throughput chemical genetics. Nonetheless, because of the growing interest in systems biology and the recognition of the complexities of biological pathways, there is increasing enthusiasm for exploiting organism-based phenotypic assays in high-throughput screening (HTS) format.
A wide variety of assay tools have been successfully tailored to the analysis of zebrafish embryonic development (Pichler et al.,2003), and methodologies from zebrafish genetic screens are adaptable to chemical library screening (Haffter et al.,1996; Peterson et al.,2000). The zebrafish embryo is an especially attractive tool because it is small in size, optically transparent, and can be kept alive in multi-well plates for several days without the need for additional nutrients. In addition, zebrafish are easy to raise, yield a large number of offspring, and the instrumentation required to perform these screens has been reported to be less extensive than that found in many HTS laboratories (MacRae and Peterson,2003; Zon and Peterson,2005).
Despite the potential of the zebrafish as a model for drug screening, the actual number of screens reported is small, involving limited numbers of compounds, with analysis performed manually (Peterson et al.,2000,2004; Burns et al.,2005; Murphey et al.,2006; North et al.,2007; Tran et al.,2007; Sachidanandan et al.,2008; Yu et al.,2008). A key challenge has been the automated assessment of phenotypes because there are few image analysis methods capable of capturing the complexity of a whole organism. Current methods analyze images based on pixel information and thresholding. These methods perform well on clearly resolved objects against a uniform background but often struggle with continuum images that possess information at multiple size scales and heterogeneity across images. This was exemplified by a recent report where zebrafish vasculature became quantifiable only after manual identification and masking of the trunk (Tran et al.,2007).
We explored the use of a method designed to translate human knowledge into automated analysis routines. The approach, termed Cognition Network Technology (CNT), is an object-oriented image analysis method that models human cognitive processes. CNT originated as a tool for analyzing satellite images and recently has been applied to histopathology and magnetic resonance imaging (Biberthaler et al.,2003; Schönmeyer et al.,2006). The premise of CNT is that human visual perception classifies objects through context-sensitive associations. To emulate these processes in a computing environment, CNT uses a self-organizing, semantic, self-similar hierarchical network of objects representing specific regions in the image. The network contains information about objects, their properties and relations, as well as processing knowledge about how to react to varying input (Binnig et al.,2002). Taught by a human expert, procedural attachments (algorithms) transform an initially unstructured input into a hierarchical network of objects. This stepwise procedure of alternating classification and segmentation is called a ruleset. During the execution of the ruleset, a hierarchical, networked structure of the image evolves to produce objects and relationships, with attributes attached to both. The creation of objects and their relationships on and across different hierarchical levels is equivalent to transforming information into knowledge. The approach is implemented in the Definiens Developer software, which provides a high-level programming language environment to create CNT rulesets.
In this report, we have established methodology for automated live imaging and analysis of compound-treated zebrafish embryos. This was accomplished by combining multi-well format imaging with artificial intelligence-based image analysis. The methodology is knowledge-based, quantitative, not limited to a particular phenotype, independent of image formats, and capable of measuring changes in fine structure not quantifiable by the human eye. Therefore, integration of automated imaging with cognitive image analysis has eliminated a major bottleneck for whole organism screening.
RESULTS AND DISCUSSION
Automated Imaging of Fluorescent Embryos and Development of a CNT Ruleset
We developed a CNT ruleset for Tg(fli1:EGFP)y1, a transgenic zebrafish line that expresses enhanced green fluorescent protein (EGFP) under the control of the fli1 promoter (Lawson and Weinstein,2002), to automatically detect and assign domains of biological relevance from fluorescence micrographs. Images of 48 hours postfertilization (hpf) embryos were acquired on a Cellomics ArrayScan II high-content reader. The ArrayScan II integrates advanced optics, fluorescence detection, scanning hardware, and computer control in a single platform. Fluorescence is detected by a high resolution, high sensitivity digital camera, and images are acquired for processing, analysis, and storage. The ArrayScan II is equipped with an automated moveable platform that allows the camera to focus on the embryo in each well to capture the images. The instrument was customized for whole embryo screening by installation of a nonstandard, low magnification objective, and optimization of plate form factors. These changes enabled automated capture of an entire well of a 96-well plate per imaging field. Using a training set of images acquired on the ArrayScan II, we created a ruleset to assemble a hierarchical network of objects and their relationships through iterative loops of locally specific segmentation and classification. From an original micrograph (Fig. 1A), the ruleset was taught to successively detect a general outline (Fig. 1B–D), the whole embryo (Fig. 1E,F), yolk, large trunk and head vessels (Fig. 1G), and the dorsal region (Fig. 1H). The assignment of lower hierarchy subdomains within the embryo permitted the specific detection of intersegmental vessels (ISV) and the dorsal longitudinal anastomotic vessel (DLAV; Isogai et al.,2003) in the dorsal tail without interference from surrounding areas (Fig. 1I). A synopsis of the ruleset design and critical decision points within the network is presented in Supp. Figure S1, which is available online.
CNT Ruleset Quantifies Changes in ISV at Two Distinct Stages of Development and in Small Molecule-Treated Embryos
We first used the ruleset to generate quantifiable measures of Tg(fli1:EGFP)y1 ISV formation comparing two developmental stages (Fig. 2). ISV started to form at 26 hpf and were completely developed and connected to the DLAV at 48 hpf (Isogai et al.,2003). Because CNT identifies complex zebrafish embryonic structures regardless of orientation, embryos could be randomly arrayed and automatically analyzed in multi-well plates. The ruleset detected significant, quantifiable differences between the two developmental stages, measuring either mean length or total area of the ISV (Fig. 2E,F).
We then evaluated ruleset performance using a known small molecule inhibitor of angiogenesis. Embryos (24 hpf) were treated for 24 hr with vehicle or various concentrations of SU4312, a known antiangiogenic VEGF receptor antagonist (Kendall et al.,1999; Molina et al.,2007; Tran et al.,2007), removed from chorions, and transferred to a 96-well imaging plate. The plate was scanned on the ArrayScan II and analyzed with the ruleset using the Cellenger application module, which permits data analysis and management in multi-well plate format. Embryos presented in a wide variety of orientations (Fig. 3A). Based on the ruleset's ability to measure general attributes of embryo morphology (Fig. 3B), we first eliminated or tagged empty wells (A12), plate loading (A01) and toxicity (E09) artifacts, and embryos that presented in a dorsal orientation (B07) (Fig. 3C). In the remaining wells, which contained intact embryos in lateral orientation, the ruleset reliably detected blood vessels in the dorsal tail in both vehicle and SU4312-treated embryos (Figs. 3A, 4E–H). Compared with vehicle control, the ISV in SU4312-treated embryos were reduced in number and length (Fig. 4A–D). The ruleset quantified these differences and delivered graded responses for ISV area, length, and shape (Fig. 4I–L). Concentrations of SU4312 that caused half-maximal inhibition of ISV development (IC50) were the same across all parameters (Fig. 4I–L) and consistent with published results (Molina et al.,2007). The analysis was highly reproducible. In three independent experiments, SU4312 inhibited ISV formation with average IC50 values of 3.9 ± 1.6 μM, 3.5 ± 1.3 μM, 4.1 ± 1.5 μM, and 4.6 ± 1.5 μM for ISV area, relative area, length, and shape, respectively. Significantly, the ruleset detected changes in ISV development at or around the IC50 (5 μM), which were difficult to assign by visual inspection (Fig. 4C,G). Thus, the CNT analysis eliminated possible observation bias.
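To illustrate how an IC50 can be derived from graded responses such as these, the following is a minimal sketch. It is not the procedure used in this study: the text does not state how the curves were fitted, so the four-parameter logistic (Hill) model, the grid-search fit with fixed top and bottom plateaus, and all function names are assumptions. The data are synthetic and noise-free, generated from the model itself, and are not measurements from this report.

```python
import math

def hill(conc, top, bottom, ic50, slope):
    """Four-parameter logistic (Hill) response at concentration conc (> 0)."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** slope)

def fit_ic50(concs, responses, top, bottom):
    """Least-squares grid search for IC50 and Hill slope (top/bottom held fixed)."""
    best = None
    lo, hi = math.log10(min(concs)), math.log10(max(concs))
    for i in range(201):                       # candidate IC50 values, log-spaced
        ic50 = 10 ** (lo + (hi - lo) * i / 200)
        for j in range(1, 41):                 # candidate slopes 0.25, 0.5, ..., 10
            slope = 0.25 * j
            sse = sum((hill(c, top, bottom, ic50, slope) - r) ** 2
                      for c, r in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ic50, slope)
    return best[1], best[2]

# Synthetic example (hypothetical values, generated from the model itself).
concs = [0.5, 1, 2, 5, 10, 20]                 # concentrations, arbitrary units
true_ic50, true_slope = 4.0, 2.0
responses = [hill(c, 100, 0, true_ic50, true_slope) for c in concs]

ic50, slope = fit_ic50(concs, responses, top=100, bottom=0)
print(f"estimated IC50 = {ic50:.2f}, slope = {slope:.2f}")
```

A grid search is used here only to keep the sketch dependency-free; a nonlinear regression routine would normally be preferred, and the resolution of the estimate is limited by the spacing of the grid.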
Antiangiogenic Activity of Microtubule Perturbing Agents
We next used the ruleset to analyze ISV formation in embryos treated with agents mechanistically distinct from VEGF receptor antagonists. Because there is evidence that microtubule-perturbing agents can be antiangiogenic (Belotti et al.,1996; Belleri et al.,2005), we applied the ruleset to Tg(fli1:EGFP)y1 embryos treated with 2-methoxy estradiol (2-OMe E2), a microtubule perturbing agent that inhibited ISV formation in another model of zebrafish angiogenesis (Tran et al.,2007). We also evaluated (−)-pironetin (Fig. 5), a natural product microtubule destabilizer that was obtained by total synthesis (Shen et al.,2006) and whose antiangiogenic properties have not been investigated. Both 2-OMe E2 and (−)-pironetin inhibited ISV formation in Tg(fli1:EGFP)y1 embryos. The ruleset measured the antiangiogenic activity of both agents in a graded manner (Fig. 5). The IC50 values calculated from three independent experiments were 7.0 μM for 2-OMe E2 and 0.75 μM for (−)-pironetin.
Performance of the CNT Ruleset Under Small Molecule Screening Conditions
Finally, we wanted to determine the ability of the ruleset to consistently identify active compounds under conditions potentially used for screening. The magnitude of response seen in the concentration-response experiments (Figs. 4, 5) suggested that multiple determinations would be needed to reliably detect positive compounds from randomly arrayed microplates, consistent with statistical considerations put forth in a recent review article (Malo et al.,2006). To investigate the minimum number of replicates needed to select positives with screening-compatible statistical significance, four 96-well plates were loaded with Tg(fli1:EGFP)y1 embryos as shown in Supp. Table S1. Each plate contained eight dimethyl sulfoxide (DMSO)-treated embryos (negative control) and eight SU4312-treated embryos (positive control) in columns 1 and 12, respectively. Three additional SU4312-treated embryos were placed in arbitrarily chosen but identical wells (B2, C8, and F6) on each of the four plates as test positives (i.e., active compounds). The remaining wells received DMSO-treated embryos (Supp. Table S1). Plates were imaged on the ArrayScan II and analyzed with the ruleset for ISV formation. The ruleset eliminated all empty wells, loading artifacts (e.g., double loading, contamination), and embryos in dorsal orientation, and reported well average values for ISV length, shape, area and relative area for all of the remaining wells. We found that the relative ISV area (i.e., the percent of dorsal area that was occupied by ISV) was the most robust parameter because it minimized the influence of size differences between embryos. Each data point was then transformed into a Z-score based on the distribution of all of the wells on a plate, except controls. The Z-score indicates how many standard deviations a data point is away from the population mean (Brideau et al.,2003) and has been used as an active criterion in HTS with high variability and small assay windows (Johnston et al.,2007).
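The Z-score transformation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the function name, the dictionary layout, the use of the sample standard deviation, and the toy values are all hypothetical. The mean and standard deviation are taken from non-control wells only, and wells eliminated by the ruleset (represented here as None) are skipped.

```python
from statistics import mean, stdev

def plate_z_scores(values, control_wells):
    """Convert per-well measurements to Z-scores.

    values:        dict of well name -> measurement (e.g., relative ISV area),
                   or None for wells eliminated by the ruleset.
    control_wells: set of well names excluded from the reference distribution.
    """
    reference = [v for w, v in values.items()
                 if w not in control_wells and v is not None]
    mu, sigma = mean(reference), stdev(reference)   # sample SD (assumption)
    # Every measured well, controls included, is scored against that reference.
    return {w: (v - mu) / sigma for w, v in values.items() if v is not None}

# Toy plate with hypothetical numbers; A1 is a control, B5 was eliminated.
plate = {"A1": 5.0, "B2": 1.0, "B3": 2.0, "B4": 3.0, "B5": None}
z = plate_z_scores(plate, control_wells={"A1"})
print(z)
```

Controls are excluded from the reference distribution so that strongly responding wells do not inflate the standard deviation against which test wells are judged.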
As expected, each individual plate contained a combination of empty wells, embryos in dorsal view, false positives, and false negatives, randomly distributed among both controls and unknowns (Supp. Table S1). In addition, the ruleset occasionally improperly assigned regions within embryos that presented in a lateral orientation. On average, the aggregate of all of these randomly distributed errors amounted to four wells per plate (4.5%). As expected (Malo et al.,2006), with increased numbers of replicates random errors averaged out, plate statistics tightened, true positives and true negatives emerged, and all of the wells became useable. As few as two replicates sufficed to eliminate all loading artifacts and embryos in a dorsal orientation (data not shown). To investigate the number of replicates needed to unambiguously detect true positives, active compounds were defined as data points that were more than 3 standard deviations below the average of all wells, except controls (i.e., Z-score < −3). We found that four replicates were needed to identify all three positive test compounds with this level of statistical significance. Figure 6 shows the Z-scores from the average of four replicates. There is clear separation between the positive (purple data points) and negative (yellow data points) controls, and all positive test embryos (blue data points with red circle) were correctly detected (Fig. 6). These data provide a justification for an experimental design involving multiple replicates in cases where assays are inherently noisy due to biological variability, and provide guidance for designing whole organism screens.
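The effect of replication on hit calling can be sketched as follows. The text does not specify exactly how replicate values were combined, so averaging per-well Z-scores across replicate plates is an assumption, as are the function names and the toy numbers. Wells eliminated on a given plate are simply absent from that plate's dictionary, so a well remains useable as long as it was measured on at least one replicate.

```python
from statistics import mean

def average_replicates(replicate_z):
    """Average per-well Z-scores over the replicate plates that measured the well.

    replicate_z: list of dicts (one per plate) of well name -> Z-score.
    """
    wells = set().union(*replicate_z)
    return {w: mean([plate[w] for plate in replicate_z if w in plate])
            for w in sorted(wells)}

def call_hits(avg_z, threshold=-3.0):
    """Wells whose averaged Z-score falls below the threshold (Z < -3 in the text)."""
    return sorted(w for w, z in avg_z.items() if z < threshold)

# Hypothetical Z-scores from two replicate plates.
replicates = [
    {"B2": -4.0, "C3": -3.5, "D4": 0.5},   # C3 looks active on this plate only
    {"B2": -3.0, "C3": 1.5},               # D4 was eliminated on this plate
]
avg = average_replicates(replicates)
hits = call_hits(avg)
print(avg, hits)
```

In this toy example the single-plate outlier C3 averages out to a Z-score of -1.0 and is not called, whereas B2 stays below the threshold across both plates.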
We have used knowledge-based image analysis to develop a system for automated phenotyping of fluorescent zebrafish embryos. The algorithm detected embryos regardless of orientation, partitioned them into regions of biological relevance, and quantified the growth of ISV in a specific region of the embryo. The integration of intelligent image analysis with automated image capture enabled the assessment of embryos in multi-well plates, eliminated observation bias present in visual scoring, delivered graded responses, and documented antiangiogenic activity of a small molecule previously not known to inhibit angiogenesis. The methodology is platform independent and can be applied to other zebrafish phenotypes. The results demonstrate that it is feasible to adapt image-based high-content screening methodology to measure complex phenotypes in whole organisms. We propose that knowledge-based rulesets will be useful to eliminate image analysis as an impediment to whole organism screening.
Zebrafish Husbandry and Chemical Treatment
The Tg(fli1:EGFP)y1 line was obtained from Dr. Brant Weinstein and maintained as described (Molina et al.,2007). Single transgenic zebrafish embryos (24 hpf) were treated for 24 hr in 200 μl E3 medium (5 mM NaCl, 0.33 mM CaCl2, 0.17 mM KCl, 0.33 mM MgSO4) containing vehicle (0.5% DMSO) or test agents: SU4312 (Sigma), 2-methoxy estradiol (Sigma) and (−)-pironetin (Shen et al.,2006). After manual removal of the chorions, single embryos were transferred to a 96-well half area plate (Greiner, Monroe, NC) containing 40 μg/ml MS222 (Tricaine methanesulfonate, Sigma) in E3 for imaging. Plates were briefly shaken on an orbital shaker and centrifuged at 1,000 × g for 2 min before image capture to dislodge embryos from the edge of the wells, to collect embryos and media at the bottom of the wells and to encourage a lateral embryo orientation. The optimal centrifugation speed at which imaging artifacts were kept to a minimum while fully maintaining 48 hpf embryo integrity was 1,000 × g.
Image Capture and Analysis
Images of fluorescent zebrafish embryos were acquired on an ArrayScan II high-content reader (Cellomics, Inc., Pittsburgh, PA) using an Omega XF100 filter set at excitation/emission wavelengths of 494/519 nm, respectively. The instrument was customized for whole embryo screening by installation of a nonstandard ×1.25 magnification objective (Olympus). A form factor was generated for the half-area microplate and refined using the Position Calibration Tool on the ArrayScan II so that the objective was precisely positioned in the center of each well. A z-plane offset of 200 μm above the plate bottom was set to compensate for the thickness of the specimen. Together with the large focal depth of the ×1.25 objective these modifications eliminated the need for automated focusing of individual wells and enabled automated capture of an entire well of the 96-well plate per imaging field.
CNT Ruleset Design and Validation
Using an initial training set of five untreated 48 hpf embryos, a strategy was formulated to detect the whole embryo and biologically relevant subdomains therein, and implemented using the Definiens Developer Software package (Definiens AG, Germany). Developer provides a high-level programming language to generate CNT rulesets. The software provides a multitude of preconfigured procedural attachments (algorithms) and measurement functions that enable the user to classify objects based on features and relationships. While the development of rulesets is primarily driven by human input based on visual observation and measurements of generic object properties such as intensity, variability, and shape, the software translates this input into a structured network by tracking object classifications, properties and relationships. The following section outlines the strategy used to detect and quantify ISV in the dorsal tail; a diagrammatic view can be found in Supp. Figure S1. An original, archived digital image from the ArrayScan II was subjected to quadtree-based segmentation. Regions of high variability were selected, merged, and fused with background areas that were fully enclosed by regions of high variability. Smaller objects of high variability were eliminated based on area measurements such that only one object remained as a general region of interest containing the zebrafish embryo. The embryo was then identified by multi-resolution segmentation of this general region of interest followed by elimination of objects that shared a large relative border with background. Having isolated the whole embryo from surrounding areas, regions of biological relevance could then be assigned within the embryo. Large vessels and head structures were identified based on relative brightness compared with overall embryo intensity. 
Small, isolated bright objects were eliminated and the remaining regions merged, re-segmented, expanded, and smoothed until the head, dorsal aorta, and posterior cardinal vein were found as contiguous regions of high fluorescent intensity that extended along the lateral axis of the embryo. Using head and trunk vessels as reference points, the yolk was defined as a uniformly dim object in proximity to the head. A lateral–dorsal region seed object was then defined based on positioning relative to yolk and head, and expanded along the large trunk vessels identified earlier. Following re-segmentation of the dorsal area, potential ISVs were then assigned as objects exceeding a threshold based on average dorsal area brightness. Objects that did not have similar neighboring objects were reassigned as dorsal area, and the remaining ISV seed objects were expanded into “true” ISV by fusion with and expansion into neighboring objects of similar brightness. The ruleset was then tested and refined on a larger set of images until the embryo and subdomains were correctly detected in all embryos that presented in lateral orientation.
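As an illustration of the first step of this strategy, the sketch below shows a generic quadtree-based segmentation: a square image is split recursively into four quadrants until the intensity range within a tile falls below a tolerance. This is a textbook version written for clarity and is not the Definiens Developer implementation, whose parameters are not described here. The function name, the `max_range` homogeneity criterion, and the toy image are assumptions, and the image side is assumed to be a power of two.

```python
def quadtree_segments(img, x, y, size, max_range, min_size=1):
    """Split a square region into homogeneous tiles.

    img:       2D list of pixel intensities, indexed img[row][col].
    x, y:      column and row of the region's top-left corner.
    size:      side length of the region (assumed to be a power of two).
    max_range: largest allowed (max - min) intensity within one tile.
    Returns a list of (x, y, size) tiles.
    """
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size <= min_size or max(block) - min(block) <= max_range:
        return [(x, y, size)]               # homogeneous enough: keep as one tile
    half = size // 2
    tiles = []
    for dy in (0, half):                    # otherwise recurse into four quadrants
        for dx in (0, half):
            tiles += quadtree_segments(img, x + dx, y + dy, half,
                                       max_range, min_size)
    return tiles

# Toy 4 x 4 image: dark background with a single bright pixel.
image = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
tiles = quadtree_segments(image, 0, 0, 4, max_range=2)
print(tiles)
```

Regions of high variability end up covered by many small tiles, while uniform background remains in a few large ones, which is what makes the subsequent selection of high-variability regions inexpensive.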
Data Generation and Analysis
Images were exported from the Cellomics Store database into Definiens Developer using the Cellenger application module. The Cellenger application recognizes image formats and file structures for all of the major high-content screening platforms and, thus, permits archiving and processing of data in multi-plate formats. Images were batch processed with the ruleset. The average processing time was 15 sec per image. Total embryo size and intensity measurements were used to identify dead embryos, plate-loading artifacts, and potentially autofluorescent compounds. Wells that contained no embryos, very small embryos, or wells in which no lateral–dorsal region could be detected were tagged and eliminated. For the remaining wells, the ruleset provided numerical measurements of ISV development (area, length, and shape). Area and length are in pixels. Shape was measured by the elliptic fit parameter. The elliptic fit of an object can assume any value between 0 and 1, with 1 being a smooth, unstructured object and 0 a highly structured, irregularly shaped object. Relative ISV area was defined as ISV area/dorsal area. Data were exported to Prism (GraphPad Software) for statistical analysis and graphical representation. Single-point data were analyzed by two-tailed Student's t-test assuming unequal variances. Concentration-dependence experiments were analyzed by one-way analysis of variance followed by Dunnett's post-test to compare each data point with vehicle control.
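The well-level tagging and the relative ISV area calculation described above can be sketched as follows. The function name, the return format, and the `min_embryo_area` cutoff are hypothetical; the actual thresholds used to flag very small embryos are not given in the text, and the numbers in the example are invented for illustration.

```python
def summarize_well(embryo_area, dorsal_area, isv_area, min_embryo_area):
    """Tag unusable wells, otherwise return relative ISV area.

    All areas are in pixels. min_embryo_area is a hypothetical cutoff below
    which an embryo is treated as "very small".
    Returns ("tagged", reason) or ("ok", isv_area / dorsal_area).
    """
    if not embryo_area:                         # None or 0: nothing detected
        return ("tagged", "no embryo")
    if embryo_area < min_embryo_area:
        return ("tagged", "very small embryo")
    if not dorsal_area:                         # no lateral-dorsal region found
        return ("tagged", "no lateral-dorsal region")
    return ("ok", isv_area / dorsal_area)       # relative ISV area

# Hypothetical measurements for three wells.
print(summarize_well(5000, 1000, 150, min_embryo_area=2000))
print(summarize_well(None, None, None, min_embryo_area=2000))
print(summarize_well(5000, 0, 0, min_embryo_area=2000))
```

Normalizing ISV area to dorsal area makes the readout dimensionless, which is why it is less sensitive to size differences between embryos than the raw pixel counts.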
We thank G. Molina and H. Codore for assistance with zebrafish maintenance and breeding. N.A.H., M.T., A.V., and J.S.L. were funded by the US National Institutes of Health. A.V. and J.S.L. were also funded by the Fiske Drug Discovery Fund. All authors, except A.C., declare that they have no competing financial interests. A.C. was an employee of Definiens, Inc.