## Introduction

Following the dynamic behavior of every cell at every point in time and space throughout the development of entire complex organisms is one of the central goals of developmental biology (Megason & Fraser 2007; Keller *et al*. 2008; Khairy & Keller 2011; Tomer *et al*. 2012). Comprehensive reconstructions of cellular dynamics and cell lineage information are indispensable for systematically dissecting functional relationships in the developmental building plan, understanding the morphological development of complex tissues and entire organisms, quantitatively and comparatively analyzing mutant phenotypes, correlating gene expression and cell fate decisions, testing biophysical models of the physical forces acting in development and, ultimately, formulating and testing models of the entire developing embryo. In the long term, the systematic reconstruction and correlation of cell lineage information for individuals of the same species as well as across species boundaries may furthermore provide key insights into the fundamental quantitative rules underlying developmental building plans.

In order to realize the automated reconstruction of cellular dynamics, however, a combination of key advances in *in vivo* fluorescence light microscopy, computational image processing, image data management and data visualization is needed to generate and efficiently analyze the large amount of information required for system-level studies of development. Figure 1 shows a generic pipeline for such experiments and analyses (please see Box 1 for a definition of technical terms). Briefly, the specimen is recorded *in vivo* for the maximum duration possible without causing damage to the fluorescent markers (owing to photo-bleaching) or to the specimen itself (owing to photo-toxic effects). Achieving good physical coverage, spatial resolution and temporal sampling is crucial to reliably capture and resolve cell migration and cell division events across the entire embryo. In the resulting datasets, cell boundaries and/or locations need to be identified for every cell in the embryo and at every time point (segmentation) and associated with the correct object in the next time point (tracking). Since complex multicellular organisms typically comprise many tens of thousands of cells even in early developmental stages, automated computational approaches are required to perform these tasks and to extract quantitative information from the recorded images, which can then be analyzed for new biological insights.
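The segmentation and tracking steps described above can be illustrated with a deliberately simple sketch. It is not the method of any of the cited works: it assumes a 2D image stored as nested lists, a fixed intensity threshold, 4-connected objects and greedy nearest-centroid linking. The function names `segment`, `centroid` and `track` and the two toy frames are hypothetical, chosen only for illustration.

```python
from collections import deque
from math import dist


def segment(image, threshold):
    """Group 4-connected pixels brighter than `threshold` into objects."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill collects one connected object.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            objects.append(pixels)
    return objects


def centroid(pixels):
    """Mean (row, column) position of the pixels of one object."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)


def track(centroids_t0, centroids_t1):
    """Link each object at t0 to the index of its nearest object at t1."""
    return [min(range(len(centroids_t1)), key=lambda j: dist(c, centroids_t1[j]))
            for c in centroids_t0]


# Toy data (made up): two bright objects that each move one pixel.
frame_t0 = [[0, 9, 9, 0, 0, 0],
            [0, 9, 9, 0, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 8, 8]]
frame_t1 = [[0, 0, 9, 9, 0, 0],
            [0, 0, 9, 9, 0, 0],
            [0, 0, 0, 0, 0, 0],
            [0, 0, 0, 8, 8, 0]]

cells_t0 = [centroid(p) for p in segment(frame_t0, threshold=4)]
cells_t1 = [centroid(p) for p in segment(frame_t1, threshold=4)]
links = track(cells_t0, cells_t1)
print(links)
```

Greedy nearest-neighbor linking is the simplest possible choice and does not enforce one-to-one assignments or handle cell divisions; recordings of real embryos are three-dimensional and require considerably more robust approaches.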

### Box 1. Summary of technical terms

**Point spread function:** mathematical description of the image formed by a microscope when the observed object can be considered a single point in space. The point spread function (PSF) characterizes the resolution of the microscope.
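As a minimal one-dimensional sketch of this definition, image formation can be modeled as a convolution of the object with the PSF. The three-element kernel below is a made-up PSF, and `convolve_same` is a hypothetical helper written for this illustration; noise and the three-dimensional nature of real PSFs are ignored.

```python
def convolve_same(signal, kernel):
    """Discrete convolution, cropped to the length of `signal`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k  # index of the sample weighted by kernel[k]
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


psf = [0.25, 0.5, 0.25]              # hypothetical 1D point spread function

point_object = [0, 0, 0, 1, 0, 0, 0]  # a single point in space
point_image = convolve_same(point_object, psf)

two_points = [0, 0, 1, 0, 1, 0, 0]    # two points separated by one empty pixel
blurred = convolve_same(two_points, psf)

print(point_image)
print(blurred)
```

The image of the single point reproduces the PSF itself, whereas the two nearby points merge into a plateau without a dip between them, which is the sense in which the PSF characterizes resolution.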

**Dwell time:** time interval over which the microscope detects and integrates signal in the currently recorded volume element. For example, in point-scanning confocal or two-photon microscopy, the dwell time corresponds to the amount of time the laser beam illuminates the focal volume corresponding to a single pixel in the final image, before moving on to the next volume element. Longer dwell times lead to a higher signal-to-noise ratio, but also reduce speed and increase photo-damage.

**Multi-photon fluorescence microscopy:** optical microscopy technique that uses a non-linear fluorescence excitation mode to achieve optical sectioning as well as deeper penetration into biological tissues.

**Structured illumination:** optical microscopy technique that uses patterns of light for specimen illumination. Two common types include incoherent structured illumination, which allows enhancing image contrast in light-scattering samples, and coherent structured illumination, which can be used to increase resolution beyond the diffraction limit.

**Bessel beam:** beam with a central peak surrounded by a concentric ring system. The central peak of the Bessel beam is thinner than the Gaussian beam typically used in light-sheet microscopy. When suppressing the contribution of the Bessel beam's ring system to the recorded image, for example, by multi-photon excitation or structured illumination, a Bessel beam light-sheet microscope can achieve higher axial resolution than a conventional light-sheet microscope.

**Image registration:** computational task of aligning two or more images with respect to each other by finding common features.

**Deconvolution:** computational task of correcting for the blurring effect arising from the point spread function of the microscope.

**Segmentation:** computational task of associating pixels in the same image that represent the same object.

**Tracking:** computational task of associating pixels or objects across different time points.

**Parametric shape:** mathematical description of the shape of an object based on an analytical formulation with few parameters. For example, an ellipsoid is a parametric shape in 3D with nine parameters.

**Non-parametric shape:** mathematical description of the shape of an object based on an exhaustive instead of an analytical formulation. For example, an ellipsoid can be described non-parametrically by enumerating all the voxels in an image that belong to it.
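The contrast between the two representations can be sketched as follows. The nine ellipsoid parameters are taken here to be the center (three), the semi-axis lengths (three) and three rotation angles; the Z-Y-X angle convention, the class `Ellipsoid` and the helper `voxelize` are assumptions made for this illustration.

```python
from dataclasses import dataclass
from math import cos, sin


@dataclass
class Ellipsoid:
    """Parametric shape: nine numbers describe the whole object."""
    center: tuple                       # (x, y, z)
    radii: tuple                        # semi-axis lengths (a, b, c)
    angles: tuple = (0.0, 0.0, 0.0)     # (yaw, pitch, roll) in radians

    def rotation_matrix(self):
        # R = Rz(yaw) * Ry(pitch) * Rx(roll); an assumed convention.
        yaw, pitch, roll = self.angles
        cy, sy = cos(yaw), sin(yaw)
        cp, sp = cos(pitch), sin(pitch)
        cr, sr = cos(roll), sin(roll)
        return [[cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
                [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
                [-sp, cp * sr, cp * cr]]

    def contains(self, point):
        # Express the point in the ellipsoid's own axes (apply R transposed).
        d = [p - c for p, c in zip(point, self.center)]
        rot = self.rotation_matrix()
        local = [sum(rot[j][i] * d[j] for j in range(3)) for i in range(3)]
        return sum((v / r) ** 2 for v, r in zip(local, self.radii)) <= 1.0


def voxelize(shape, size):
    """Non-parametric shape: explicit list of all voxels inside `shape`."""
    return [(x, y, z)
            for x in range(size) for y in range(size) for z in range(size)
            if shape.contains((x, y, z))]


cell = Ellipsoid(center=(2, 2, 2), radii=(2, 1, 1))
voxels = voxelize(cell, size=5)
print(len(voxels), "voxels describe the same shape as nine parameters")
```

The voxel list grows with the size of the object and the sampling of the image, whereas the parametric description always consists of nine numbers; in return, the voxel list can represent arbitrary shapes.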

**Contour:** type of non-parametric shape for 2D closed objects, such as cell membranes and nuclei. In this case, the user enumerates all the points along the boundary of the object in order to describe it.

**Energy function:** in image processing, this refers to a mathematical function that models the problem at hand. The extremum (minimum or maximum) of this function should correspond to the correct biological solution.

**Level sets:** non-parametric shape representation. The shape is described by all the points equal to a given value (usually zero) of a mathematical function, which allows great flexibility with respect to the type of shapes that can be represented.
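A minimal 2D sketch of this idea, assuming signed distance functions as the underlying mathematical function (one common choice, not the only one): the shape boundary is the set of points where the function equals zero, and the interior is where it is negative. The helpers `circle_phi`, `union` and `inside` are hypothetical names.

```python
from math import hypot


def circle_phi(cx, cy, radius):
    """Signed distance to a circle: negative inside, zero on the boundary."""
    def phi(x, y):
        return hypot(x - cx, y - cy) - radius
    return phi


def union(phi_a, phi_b):
    """Level-set function of the union of two shapes (pointwise minimum)."""
    def phi(x, y):
        return min(phi_a(x, y), phi_b(x, y))
    return phi


def inside(phi, x, y):
    """A point belongs to the shape where the function is negative."""
    return phi(x, y) < 0


# Two separate circular objects represented by a single function.
two_cells = union(circle_phi(0, 0, 1), circle_phi(3, 0, 1))

print(inside(two_cells, 0, 0))     # center of the first object
print(inside(two_cells, 1.5, 0))   # gap between the two objects
print(two_cells(1, 0))             # a point on the zero level set
```

Because the shape is defined implicitly, a single function can describe one object, several disconnected objects or objects that merge and split, without changing the representation.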

**Active contours or snakes:** computational technique for fitting contours to images using an energy minimization approach.

**Image feature:** any information that can be extracted from the image and that can be represented by a single real number in order to compare its value in different regions of interest.

**Machine learning classifier:** mathematical function that, based on some input such as image features, predicts the correct output for a given task, for example, deciding whether a cell is dividing or not. The function has free parameters that can be adjusted using examples given by the user (training data), effectively learning to model the given task.
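To make the notion of adjustable free parameters concrete, the following sketch trains a perceptron, one of the simplest linear classifiers (it is not a support vector machine). The two image features and all training values are synthetic and chosen only for illustration; the function names `train_perceptron` and `predict` are hypothetical.

```python
def train_perceptron(samples, labels, epochs=20, rate=1.0):
    """Adjust weights and bias whenever a training example is misclassified."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            if y * score <= 0:  # wrong side of (or on) the decision boundary
                weights = [w + rate * y * v for w, v in zip(weights, x)]
                bias += rate * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 (dividing) or -1 (not dividing) for one feature vector."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Synthetic training data: (elongation, marker intensity), both made up.
features = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3),   # labeled as not dividing
            (0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]   # labeled as dividing
labels = [-1, -1, -1, 1, 1, 1]

weights, bias = train_perceptron(features, labels)
print(predict(weights, bias, (0.85, 0.85)))
print(predict(weights, bias, (0.15, 0.15)))
```

The weights and the bias are the free parameters of the function; the training loop changes them only when an example provided by the user is classified incorrectly.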

**Support vector machine:** specific type of machine learning classifier, which has become very popular due to its ease of use and its applicability to a large spectrum of problems.

In the following sections, we discuss state-of-the-art approaches for each of these steps in more detail, from image acquisition to image analysis and data visualization. Although there are still many challenges (Keller *et al*. 2008; McMahon *et al*. 2008; Olivier *et al*. 2010), we argue that complete cell lineage reconstructions in complex multicellular organisms are within reach in the coming years.