Instrumenting and tuning dataView—a networked application for navigating through large scientific datasets



dataView is an application that allows scientists to fly visually through large, regularly-gridded, time-varying 3D datasets from their desktop computers. dataView works with data that has been divided into cubes and sub-cubes (which we call ‘tiles’ and ‘subtiles’), sampled at three levels of detail and written to a terabyte data server built on a PC cluster. dataView is a networked application. The dataView client component that runs on the scientist's computer is used only for user interaction and rendering. The selection of data subtiles for any given scene, and the geometry computation performed on those subtiles to create the virtual world, are performed by dataView components run in parallel on nodes of the PC cluster.
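
To illustrate the tile and subtile decomposition described above, the following minimal sketch maps a voxel coordinate in a regular 3D grid to the tile and subtile that contain it. The edge lengths, the sampling strides for the three levels of detail, and the names `locate` and `samples_per_subtile_edge` are assumptions made for this example only; they are not taken from the dataView code.

```python
from typing import NamedTuple, Tuple

# All constants below are hypothetical values chosen for illustration.
TILE_EDGE = 64            # assumed voxels along one tile edge
SUBTILE_EDGE = 16         # assumed voxels along one subtile edge
LOD_STRIDES = (1, 2, 4)   # assumed sampling stride for each of three levels of detail


class SubtileAddress(NamedTuple):
    tile: Tuple[int, int, int]      # index of the tile in the grid of tiles
    subtile: Tuple[int, int, int]   # index of the subtile inside that tile
    lod: int                        # level of detail (0 = finest)


def locate(voxel: Tuple[int, int, int], lod: int) -> SubtileAddress:
    """Return the tile and subtile containing a voxel at a given level of detail."""
    if lod not in range(len(LOD_STRIDES)):
        raise ValueError(f"unknown level of detail: {lod}")
    tile = tuple(c // TILE_EDGE for c in voxel)
    subtile = tuple((c % TILE_EDGE) // SUBTILE_EDGE for c in voxel)
    return SubtileAddress(tile, subtile, lod)


def samples_per_subtile_edge(lod: int) -> int:
    """Number of samples along one subtile edge after striding for this level."""
    return SUBTILE_EDGE // LOD_STRIDES[lod]
```

Under these assumed sizes, a coarser level of detail stores fewer samples per subtile, which is the kind of trade-off between data volume and visual fidelity that a level-of-detail scheme exposes to the client.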

This paper describes how we instrumented and tuned the code for improved performance in a networked environment. We report on how we measured network performance, first by inducing network delay and then by running the dataView client component in Washington DC and the compute components in Los Angeles. We report on the effect that tile size, level of detail, and client CPU speed have on performance. We analyze what happens when the geometry computation is performed in parallel using MPI (Message Passing Interface) rather than serially, and discuss how adding computational nodes affects performance. Copyright © 2001 John Wiley & Sons, Ltd.