We propose a new image-space technique that summarizes a 3D animation sequence in a single image by using the depth information of the animation. The proposed method extracts important frames from the animation sequence, where the important frames are representative of the sequence and keep the composed image as simple as possible. Assuming that the input sequence consists of a set of images with depth information, we construct a composite depth image and its gradient image. We evaluate the importance of each frame by how much it contributes to the gradient of the composite depth image. Frames of higher importance are more likely to occur where the motion of a moving object reaches its extreme positions, its fastest speed, or its slowest speed in image space. Proceeding from the most important frames to the least important, we recursively compose the important frames into a single composite image while limiting the complexity of the composite image by evaluating the amount of self-overlap. A threshold on the amount of overlap allows the user to interactively control the visual complexity of the composed image. Copyright © 2012 John Wiley & Sons, Ltd.
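The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: depth images are nested lists of normalized depths with `FAR` marking background, the composite depth image takes the per-pixel nearest depth, a frame's contribution is the sum of absolute depth differences to 4-neighbours at the pixels it supplies, and self-overlap is the fraction of a frame's object pixels already covered by previously composed frames. The names `composite_depth`, `frame_importance`, and `summarize` are hypothetical, and the composition is written as a greedy loop rather than a recursion.

```python
from typing import List, Tuple

FAR = 1.0  # assumed normalized depth value for background pixels
DepthImage = List[List[float]]


def composite_depth(frames: List[DepthImage]) -> Tuple[DepthImage, List[List[int]]]:
    """Per-pixel nearest depth over all frames, plus the index of the
    frame that supplied it (-1 where every frame shows background)."""
    h, w = len(frames[0]), len(frames[0][0])
    depth = [[FAR] * w for _ in range(h)]
    owner = [[-1] * w for _ in range(h)]
    for k, frame in enumerate(frames):
        for y in range(h):
            for x in range(w):
                if frame[y][x] < depth[y][x]:
                    depth[y][x] = frame[y][x]
                    owner[y][x] = k
    return depth, owner


def frame_importance(frames: List[DepthImage]) -> List[float]:
    """Assumed importance measure: gradient magnitude of the composite
    depth image, accumulated at the pixels each frame supplies."""
    depth, owner = composite_depth(frames)
    h, w = len(depth), len(depth[0])
    importance = [0.0] * len(frames)
    for y in range(h):
        for x in range(w):
            k = owner[y][x]
            if k < 0:
                continue
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    importance[k] += abs(depth[ny][nx] - depth[y][x])
    return importance


def summarize(frames: List[DepthImage], max_overlap: float) -> List[int]:
    """Greedily pick frames from most to least important, skipping any
    frame whose overlap with the composition so far exceeds max_overlap."""
    importance = frame_importance(frames)
    order = sorted(range(len(frames)), key=lambda k: importance[k], reverse=True)
    covered = set()
    selected = []
    for k in order:
        mask = {(y, x)
                for y, row in enumerate(frames[k])
                for x, d in enumerate(row) if d < FAR}
        if not mask:
            continue  # frame contains no object pixels
        overlap = len(mask & covered) / len(mask)
        if overlap <= max_overlap:
            selected.append(k)
            covered |= mask
    return selected


# Toy 1x6 sequence: an object moving left to right at different depths.
frames = [
    [[0.2, 0.2, FAR, FAR, FAR, FAR]],
    [[FAR, 0.5, 0.5, FAR, FAR, FAR]],
    [[FAR, FAR, FAR, FAR, 0.4, 0.4]],
]
print(summarize(frames, max_overlap=0.25))  # → [1, 2]
```

Raising `max_overlap` admits more frames into the composition (here, `0.5` also admits frame 0), which mirrors the user-controlled trade-off between completeness and visual complexity described in the abstract.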